Modeling of Carbon Emission from Gasoline Utilization at Molecular Level Based on Virtual Sample Generation

Petroleum Processing and Petrochemicals (石油炼制与化工) ›› 2024, Vol. 55 ›› Issue (10): 24-31.


宋建,崔晨,郭莘,田华宇,韩璐,周祥

SINOPEC Research Institute of Petroleum Processing Co., Ltd.

Received: 2024-01-26 Revised: 2024-07-03 Online: 2024-10-12 Published: 2024-09-26
Corresponding author: Cui Chen (崔晨) E-mail: cuichen.ripp@sinopec.com


Abstract

Under the strategic goals of carbon peaking and carbon neutrality, gasoline, as a high-carbon-emission fuel, faces the challenge of CO₂ emission reduction. Based on gasoline composition data obtained by gas chromatography and gasoline CO₂ emission data obtained under the New European Driving Cycle, the gasoline components were classified by PONA composition, carbon number, and number of substituents. Hierarchical clustering was applied to the gasoline composition data, and the training and test sets were divided according to the clustering results. A prior model of the CO₂ emission per kilometer traveled by gasoline-fueled vehicles was then established, with the aim of providing data support for the production of low-carbon-emission gasoline. Because the data samples cover a small and concentrated range, the prior model has poor applicability when predicting CO₂ emissions. A multi-distribution mega-trend-diffusion technique based on radius nearest neighbor classification (RNC-MD-MTD) was therefore proposed and used to generate virtual samples. The results show that the prediction accuracy of the model improves as virtual samples generated by the RNC-MD-MTD method are added, which demonstrates the effectiveness of the method. The final model for predicting CO₂ emission per kilometer traveled has a coefficient of determination of 0.98, a mean absolute percentage error of 0.29%, and a root-mean-square error of 792.6 mg/km.

Key words: gasoline components, CO₂ emission, virtual sample, radius nearest neighbor classification
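The abstract reports three accuracy metrics for the final model: the coefficient of determination, the mean absolute percentage error, and the root-mean-square error. The sketch below shows how these quantities are commonly computed, assuming the standard textbook definitions; the abstract does not state the exact formulas the authors used. The function names and the sample values in mg/km are hypothetical and serve only as an illustration.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mape_percent(y_true, y_pred):
    """Mean absolute percentage error, expressed in percent."""
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


def rmse(y_true, y_pred):
    """Root-mean-square error, in the same unit as the inputs."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Hypothetical measured and predicted CO2 emissions in mg/km
# (illustrative numbers only, not data from the paper).
measured = [200000.0, 210000.0, 220000.0, 230000.0]
predicted = [201000.0, 209000.0, 221000.0, 229000.0]

r2_value = r_squared(measured, predicted)
mape_value = mape_percent(measured, predicted)
rmse_value = rmse(measured, predicted)

print(f"R2 = {r2_value:.3f}, MAPE = {mape_value:.2f}%, RMSE = {rmse_value:.1f} mg/km")
```

Because the RMSE carries the unit of the target variable, an RMSE quoted in mg/km corresponds to a model whose output is the emission per kilometer.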

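宋建, 崔晨, 郭莘, 田华宇, 韩璐, 周祥. Modeling of Carbon Emission from Gasoline Utilization at Molecular Level Based on Virtual Sample Generation[J]. Petroleum Processing and Petrochemicals (石油炼制与化工), 2024, 55(10): 24-31.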
