Image, speech and text (language) information, which is closely tied to human visual and auditory perception, plays an important role in society, the economy, national security and other fields, and its volume will continue to grow rapidly in the coming years. Such information can be perceived and understood directly by humans and can also be processed by computers; however, a computer’s ability to process it is far below that of a human, and its processing efficiency cannot meet the demands of today’s society. Drawing on human cognitive mechanisms and the latest research achievements in the relevant mathematics to establish new computational models and methods will significantly improve computers’ comprehension and processing efficiency for this kind of information. This will not only strongly promote the rapid development of information science but also contribute significantly to economic and social development.
1. Scientific Targets
This Major Research Plan focuses on major national demands and aims to contribute to national security and public safety, to promote the development of information services and related industries, and to raise the nation’s quality of life and health. Its overall scientific targets are to study and construct new computational models and methods based on human visual and auditory cognitive mechanisms, drawing fully on the interdisciplinary strengths of the information sciences, life sciences and mathematical sciences, in order to improve computers’ comprehension of unstructured visual and auditory perception information and their efficiency in processing massive heterogeneous information, and to overcome the bottlenecks in image, speech and text (language) information processing. The expected progress is as follows: important progress will be made in basic theoretical research on visual and auditory information processing; major breakthroughs will be achieved in three key technologies, namely the collaborative computing of visual and auditory information, Chinese language understanding, and brain-computer interfaces related to visual and auditory perception; and an unmanned vehicle verification platform capable of perceiving the natural environment and making intelligent decisions will be developed by integrating these research achievements, with its main performance indicators reaching the world’s advanced level. The purpose of this Major Research Plan is therefore to enhance the country’s overall research strength in the field of visual and auditory information processing, to cultivate outstanding talents and teams with international influence, and to provide the research environment and technical support for national security and social development.
2. Key Scientific Issues
Focusing on key scientific issues such as “perceptual feature extraction, expression and integration”, “machine learning and understanding of perception data” and “collaborative computing of multi-modal information”, this Major Research Plan will organize and implement research in the following four main areas.
1) Image and Visual Information Computing
The main research topics are the cognitive mechanisms of image and visual information computing, the extraction and selection of basic visual features, object recognition and the understanding of image content, the behavior analysis of moving objects in complex scenes, and so on. The goals are to propose high-performance computational models of image and visual information, to obtain original, internationally recognized research achievements (high-level papers should be published in journals such as Nature, Science and IEEE Trans. PAMI), and to cultivate outstanding talents and research teams with international influence.
2) Computing of Speech Sound and Auditory Information
The main research topics are the mechanisms of auditory perception and speech scene analysis, speech recognition in natural environments and high-performance speech synthesis, the analysis and understanding of spoken dialogue, and so on. The goals are to obtain original, internationally influential research results, to propose effective computational models of speech and auditory information, to publish high-level papers in internationally authoritative journals in this field, and to cultivate outstanding talents and research teams with international influence.
3) Natural Language (Chinese) Understanding
The main research topics are the cognitive mechanisms of language processing, the modeling of language knowledge and computational models for semantics, the network-oriented moderate understanding model of Chinese and the tools for serial analysis, the key techniques that support the analysis, recognition and understanding of spoken dialogue in natural environments, and so on. Building on existing related domestic results, a large-scale, high-standard Chinese semantic knowledge base will be comprehensively constructed. These research results will be applied to typical language (Chinese) information processing systems to significantly improve the comprehension of natural language (sentences, paragraphs, chapters), and will be verified in network-based information retrieval, filtering and knowledge acquisition.
4) Collaborative Computing of Multi-Modal Information and Brain-Computer Interface
The main research topics are the cognitive mechanisms and computational models for the collaboration of multi-modal perception information, pattern recognition and environment interaction based on the fusion of visual and auditory information, cross-modal video information retrieval, the filtering of sensitive information over networks, and so on. The goals are to significantly improve the precision of cross-modal video information retrieval and to markedly enhance overall research strength in this field.
Research will also cover the methods and techniques for brain signal extraction, brain region localization and network analysis of brain function related to visual and auditory cognition; the techniques of signal transmission, processing and control in brain-computer interaction; and typical applications of brain-computer interfaces related to visual and auditory cognition. The results are to be verified or applied in areas such as improving quality of life and the functional rehabilitation of persons with disabilities, and are to provide new techniques for extending and improving human behavior control.
3. Key Technologies and Integration and Verification Platform
Based on the research described above, this Major Research Plan will further study and develop the key technologies and the integration and verification platform related to visual and auditory information processing.
1) Key Technology of Collaborative Computing of Visual and Auditory Information
The research topics are machine computational models for the collaboration of visual and auditory information and the techniques for realizing them as systems; the techniques for pattern recognition based on the fusion of visual and auditory information and the corresponding verification system; and the techniques of cross-modal video information retrieval and sensitive information filtering over networks, together with their applications. The goals are to use computational models of multi-modal collaboration to make the precision of video information retrieval over networks 5%-10% higher than the best level achieved abroad in the same period, and to verify the results in areas such as network information security and services.
2) Key Technology of Natural Language (Chinese) Understanding
The research topics are the standardized semantic knowledge base of common Chinese vocabulary and its construction techniques, the techniques for realizing the network-oriented moderate understanding model of Chinese and the tools for serial analysis, and the key techniques supporting the analysis, recognition and understanding of spoken dialogue in natural environments. Building on existing related domestic results, a semantic knowledge base of Chinese will be comprehensively constructed, in which the common Chinese vocabulary will contain no fewer than 50 thousand words and the semantically labeled balanced Chinese corpus will contain no fewer than 10 million words. The results are to be applied in Chinese processing systems in a network environment, where the accuracy rates of information retrieval and knowledge acquisition should be significantly higher than those achieved with the best available techniques.
3) Key Technology of Brain-Computer Interface Related to Visual and Auditory Cognition
The research topics are the techniques for brain signal extraction, brain region localization and network analysis of brain function related to visual and auditory cognition; the techniques of signal transmission, processing and control in brain-computer interaction and their system realization; and typical applications of brain-computer interfaces related to visual and auditory cognition. The proposed information extraction and analysis techniques for non-invasive brain-computer interfaces should reach the internationally leading level in the same period and should be verified or applied in areas such as improving quality of life and the functional rehabilitation of persons with disabilities.
4) Integration and Verification Platform of Unmanned Vehicle
The goals are, by integrating the research achievements in the aforementioned basic theories and key technologies and combining traditional visual computing models with new visual cognitive models, to achieve new breakthroughs in environment perception and modeling; to realize multi-sensor, cross-modal and cross-scale information fusion, generate high-quality three-dimensional maps for scene cognition, and construct a high-performance verification platform for unmanned intelligent vehicles; and to provide new key technologies for intelligence-assisted safe driving based on comprehensive analysis of the person-vehicle-road state, with verification or application in defense, intelligence-assisted safe driving and other related fields of significant impact.
4. Project Grants
The implementation period of this Major Research Plan is 8 years (from August 2008 to December 2015), and the total funding is 150 million RMB Yuan.
This Major Research Plan will fund applications in three forms, “fostering projects”, “key funding projects” and “integrated projects”, which differ in funding intensity and goals. An application with innovative academic ideas and research value that still requires further exploration will be funded as a “fostering project” (about 500 thousand RMB Yuan per project). An application with stronger innovative academic ideas and research value, a good research foundation and accumulated achievements, and a major contribution to the overall targets of this research plan will be funded as a “key funding project” (about 3 million RMB Yuan per project). An application that plays a decisive role in realizing the overall goal of the research plan will be funded as an “integrated project” with greater funding intensity (about 10 million to 16 million RMB Yuan per project). Depending on annual progress or the inspection results of project implementation, this Major Research Plan may appropriately adjust the funding of approved projects (by suspending a project or providing additional funding).