China’s scientific and technological innovation gets a "multiplier" boost

Supporting frontier research, backing technological innovation, and promoting the integration of digital and intelligent technologies

Data has become an important factor in innovation. Large artificial intelligence models, the creation of new materials, biological breeding and basic scientific research all depend on data.

Seventeen departments, including the National Data Administration, recently issued the "Three-year Action Plan for Data Elements ×" (2024-2026). The plan explicitly launches the "Data Elements × Scientific and Technological Innovation" action and sets out measures in several areas: promoting the orderly and open sharing of scientific data, strengthening the construction of high-quality scientific data resources and their application in real scenarios, supporting frontier research with scientific data, supporting technological innovation with scientific data, supporting large model development with scientific data, and exploring new research paradigms.

From supporting basic research, to helping develop cutting-edge technologies such as artificial intelligence, to reforming the way research is done, China’s scientific and technological innovation is riding the "east wind" of the "Data Elements ×" three-year action plan to multiply its gains.

Build "software and hardware"

One important goal of the "Data Elements × Scientific and Technological Innovation" action is to promote the orderly and open sharing of scientific data. This means interconnecting the scientific data generated by major scientific and technological infrastructure and major projects, supporting and cultivating scientific databases with international influence, and relying on platforms such as the National Science Data Center to build high-quality scientific data resources and apply them in real scenarios.

Work on the relevant "software and hardware" is under way, and localities across the country are actively deploying it.

In 2024, Beijing will promote a number of major projects such as computing centers, data training bases and national blockchain hub nodes.

Jiangsu will systematically promote the large-scale deployment of 5G and gigabit optical networks, support the construction of the Suzhou national Internet backbone direct connection point, and accelerate the rollout of computing facilities such as intelligent computing and edge computing.

Sichuan proposes to build digital information infrastructure moderately ahead of demand, speed up construction of its national hub node in the "East Data, West Computing" project, build a province-wide platform for scheduling computing power, and develop an integrated computing network that combines computing power, storage capacity and transmission capacity.

Shandong proposes to deploy high-performance intelligent computing centers and to coordinate computing power for both general-purpose and vertical large models. Its targets include building more than 25 provincial-level new data centers rated 5A, raising the share of intelligent computing power to 30%, and completing the "Shandong Computing Network". The province will support Jining in building the Lunan Computing Center. It will also push ahead with the "Double Gigabit" network project, creating more than 500 typical application projects and bringing 40,000 new 5G base stations into service.

"Hardware" facilities are strengthened, and "software" facilities also need to be upgraded.

"The Internet is the platform on which data circulates and converges, and it is the key infrastructure of the digital economy era," said Mei Hong, an academician of the Chinese Academy of Sciences, adding that it is necessary to speed up the construction of new infrastructure such as digital networking and data spaces.

In 2021, the Chinese Academy of Sciences released ScienceDB, an open, general-purpose platform for storing and publishing scientific data that is able to serve international users.

The Scientific Data Bank, developed independently by the Computer Network Information Center of the Chinese Academy of Sciences, is a repository for the data behind research papers. It offers an efficient way to aggregate, manage, open and share such data, and provides a platform and service guarantee for upholding research integrity, cultivating a culture of sharing, accelerating data circulation and promoting international cooperation.

Researchers can store and publish the scientific data they collect in the Scientific Data Bank. By taking in these "data deposits", the bank pools data resources scattered among individuals and groups, "turning small money into big money and dead money into living money", so that the data become easier to find, access, interoperate with and reuse. Researchers can also upload the data for a paper to the Scientific Data Bank before submitting the paper.

As of February 2 this year, the Scientific Data Bank has collected more than 8.2 million open data sets, with more than 700 million platform visits.

Develop large models

Developing large artificial intelligence models is another important goal of the "Data Elements × Scientific and Technological Innovation" action.

The "Three-year Action Plan for Data Elements ×" (2024-2026) proposes that scientific data should support the development of large models. All kinds of scientific data and scientific literature should be mined in depth. Through fine-grained knowledge extraction and multi-source knowledge fusion, a base of scientific knowledge resources should be built, along with high-quality corpora and basic scientific datasets, to support the development and training of large artificial intelligence models.

China has a solid computing power foundation and a broad market for large models, and in recent years domestic large models have appeared frequently and iterated ever faster. According to data from the CCID Research Institute of the Ministry of Industry and Information Technology, China has more than 19 developers of large language models, 15 of which have completed filing procedures for their model products.

With general capabilities in language understanding, logical reasoning, knowledge question answering and text generation, these large language model products were welcomed by users as soon as they were launched.

"Scientific and technological innovation has achieved new breakthroughs. The iFLYTEK Spark Cognitive Model is at a nationally leading level." So reads a sentence in this year’s "Government Work Report" of Anhui Province.

iFLYTEK Spark is a new-generation cognitive model officially released by iFLYTEK in May 2023. Since its release it has gone through many iterations, steadily upgrading the technical base of its core capabilities and being applied across all walks of life. iFLYTEK Spark has been recognized in several evaluations by the National Research Institute of the State Council Development Research Center and the China Enterprise Development Research Center of Xinhua News Agency Research Center, and is regarded as a high-quality domestic model.

"Only by building large models on a fully autonomous and controllable platform can we keep the initiative for development in the era of general artificial intelligence firmly in our own hands," Liu Cong, president of the iFLYTEK Research Institute, told this reporter. In October 2023, at the iFLYTEK Global 1024 Developers Festival, iFLYTEK announced that it would work with Huawei to build "Feixing No.1", a domestic computing platform for large models. On this basis, the iFLYTEK Spark model began training at a larger scale.

iFLYTEK Spark V3.5, trained on "Feixing No.1", was released on January 30th. The upgraded version shows significant improvements in logical reasoning, language understanding, text generation, mathematical problem solving and multimodal capabilities. At the same time, iFLYTEK released the Spark Voice Model and an open-source model.

"Large models have brought new opportunities for the development of voice technology," Liu Cong said. Giving machines the ability to learn, reason and make decisions is the main task of cognitive models.

"We believe there may be four trends in the future development of artificial intelligence models," Liu Cong told this reporter. "The first is multimodal and multilingual capability. From the perspective of general artificial intelligence, the cognitive intelligence model is the core foundation. On this basis, other data such as voice, image and video can be aligned into a unified semantic space, and multimodal systems can be realized by combining plug-in tools. The second is trustworthiness and explainability. This requires ensuring the quality of the sources of massive data, the capability of the large model itself and the continuous optimization of the system as a whole, backed by regulatory policies, laws and regulations issued by the state. The third is development toward systematic innovation. There are precedents of AI products and applications built by combining single-point technologies. With the support of large model capabilities, we need to combine a variety of strong technologies to innovate systematically and pay attention to the moat effect this brings. The fourth is the full localization of software and hardware. iFLYTEK has invested in and taken a deep part in building the software ecosystem for domestic AI chips, and has made some gains on both the training side and the inference side."

Promote the integration of digital and intelligent technologies

Intelligent retrieval, keyword screening, access to the latest medical information … With the convenience brought by big data and artificial intelligence, users have these services at their fingertips. In October 2023, the Standard Yunxiang Station of Taizhou City, Jiangsu Province was officially launched. Drawing on millions of standards records, it provides enterprises with authentic, continuously updated and more user-friendly information services.

This is an innovative practice by Taizhou to promote the deep integration of digital and intelligent technologies with standards. Across Taizhou’s medical and health industry, from online to offline and from "laboratory" to "workshop", the results of "intelligent transformation and digital transformation" have reached the "production line".

In the small-volume injection workshop of Jiangsu Datong Pharmaceutical Co., Ltd. in Taizhou Pharmaceutical High-tech Zone (Gaogang District), the automated production line runs in an orderly manner, and drug production is efficient, accurate and stable. Through measures such as "replacing people with machines" and the integrated management of information systems, the production plants of Yangzijiang Pharmaceutical Group have made the whole production process more intelligent and digital. Jiangsu Longfengtang Chinese Medicine Co., Ltd. has developed a complete set of modern solutions covering Chinese herbal medicines from pre-treatment to extraction, and has created a model of "intelligent transformation and digital transformation" in the standardization of intelligent manufacturing of Chinese herbal medicines.

The practice in Taizhou shows that data has become an important factor in the medical and health industry, and that digital technology has become a necessary tool for innovation in biomedicine. The integration of digital and intelligent technologies is indispensable to the innovation and development of the medical and health industry.

Promoting this integration is also one of the important measures in the "Data Elements × Scientific and Technological Innovation" action. The "Three-year Action Plan for Data Elements ×" (2024-2026) proposes that scientific data should support technological innovation, with a focus on biological breeding, new material creation and drug research and development, and that the integration of digital and intelligent technologies should accelerate technological innovation and industrial upgrading.

In recent years, a new generation of digital and intelligent technologies, such as artificial intelligence, blockchain, deep learning and the Internet of Things, has been integrating, iterating and spreading. These technologies have penetrated R&D, design, manufacturing, customer service and other links, bringing all-round, full-chain change to production technologies and production methods. They have raised the level of automation, digitalization and intelligence across industry, and provide a key driving force for integrating digital and intelligent applications and accelerating the formation of new quality productive forces.

Qian Xiaojing, a professor at the School of Economics and Management of Northwest University, describes data elements as a new key production factor of the digital economy era, stored and processed by computer equipment in intangible form. In her view, they have the technical and economic characteristics of non-rivalry, low copying cost, non-excludability and strong externality, and can be reused by different parties. Through deconstruction and reorganization, convergence and integration, they generate the combined value of "data + algorithm + computing power", which provides a basic and important resource for strengthening the integration of digital and intelligent technologies and accelerating the formation of new quality productive forces.

"On the one hand, the organic integration of data elements with traditional production factors has enriched the forms that digital-intelligent integration takes and shifted it from geographical space to digital space," Qian Xiaojing said. "On the other hand, data elements act as a 'medium' in the reconfiguration and recombination of traditional production factors, giving rise to a new model of processing production factors and a new form of digital-intelligent integration."