Petroleum Processing and Petrochemicals ›› 2024, Vol. 55 ›› Issue (10): 24-31.

• Basic Research •

Modeling of Carbon Emissions from Gasoline Use at the Molecular Level Based on Virtual Sample Generation

宋建,崔晨,郭莘,田华宇,韩璐,周祥   

  1. SINOPEC Research Institute of Petroleum Processing Co., Ltd.
  • Received: 2024-01-26 Revised: 2024-07-03 Online: 2024-10-12 Published: 2024-09-26
  • Corresponding author: Cui Chen (崔晨) E-mail: cuichen.ripp@sinopec.com

MODELING OF CARBON EMISSION FROM GASOLINE UTILIZATION AT MOLECULAR LEVEL BASED ON VIRTUAL SAMPLE GENERATION


  • Received: 2024-01-26 Revised: 2024-07-03 Online: 2024-10-12 Published: 2024-09-26

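Abstract: Against the strategic backdrop of carbon peaking and carbon neutrality, gasoline, a high-carbon-emission fuel, faces the challenge of reducing CO2 emissions. Using gasoline composition data obtained by gas chromatography and gasoline CO2 emission data measured over the New European Driving Cycle, the gasoline components were classified by PONA composition, carbon number, and number of substituents. The composition data were grouped by hierarchical clustering, and the training and test sets were divided according to the clustering result. On this basis, a prior model of the CO2 emitted per kilometer driven by a gasoline vehicle was established, with the aim of providing data support for producing low-carbon-emission gasoline. Because the data samples cover a small and concentrated range, the prior model has poor applicability when predicting CO2 emissions. A multi-distribution mega-trend-diffusion technique based on radius nearest neighbor classification (RNC-MD-MTD) is therefore proposed and used to generate virtual samples. The results show that adding virtual samples generated by RNC-MD-MTD effectively improves the prediction accuracy of the model, which demonstrates the effectiveness of the method. The final model for predicting the CO2 emitted per kilometer driven by a gasoline vehicle has a coefficient of determination of 0.98, a mean absolute percentage error of 0.29%, and a root-mean-square error of 792.6 mg/km.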

Keywords: gasoline components, CO2 emission, virtual sample, radius nearest neighbor classification
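
The virtual-sample step summarized in the abstract builds on mega-trend diffusion (MTD). The abstract does not give the details of RNC-MD-MTD, so the sketch below is not the authors' method. It shows only the basic single-feature MTD idea under stated assumptions: the observed range of a small sample is widened using its skewness and variance, and virtual values are then drawn from a triangular distribution over the widened range. The function names, the `1e-20` diffusion constant, the triangular sampling shortcut, and the example numbers are all illustrative assumptions.

```python
import math
import random
import statistics


def mtd_bounds(values):
    """Widen the observed range of one feature using basic mega-trend diffusion."""
    lo, hi = min(values), max(values)
    center = (lo + hi) / 2.0
    # Count samples on each side of the range midpoint.
    n_low = sum(1 for v in values if v < center)
    n_high = sum(1 for v in values if v > center)
    if n_low == 0 or n_high == 0:
        # Degenerate case (all values identical): nothing to diffuse.
        return lo, hi
    skew_low = n_low / (n_low + n_high)
    skew_high = n_high / (n_low + n_high)
    var = statistics.variance(values)  # sample variance
    spread = -2.0 * math.log(1e-20)    # assumed diffusion constant
    lower = center - skew_low * math.sqrt(spread * var / n_low)
    upper = center + skew_high * math.sqrt(spread * var / n_high)
    # The widened range must never be narrower than the observed one.
    return min(lower, lo), max(upper, hi)


def generate_virtual_values(values, n_virtual, seed=0):
    """Draw virtual values from a triangular distribution over the widened range.

    Assumption: sampling directly from a triangular density peaked at the
    range midpoint stands in for the MTD membership function.
    """
    rng = random.Random(seed)
    lower, upper = mtd_bounds(values)
    center = (min(values) + max(values)) / 2.0
    return [rng.triangular(lower, upper, center) for _ in range(n_virtual)]


if __name__ == "__main__":
    # Made-up mass fractions (%) of one gasoline component class.
    observed = [1.0, 2.0, 2.5, 4.0]
    print(mtd_bounds(observed))
    print(generate_virtual_values(observed, n_virtual=5))
```

This handles one input feature at a time and does not assign CO2 emission values to the virtual samples. According to the abstract, the paper's method additionally uses radius nearest neighbor classification and multiple distributions, neither of which is reproduced here.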

Abstract: Under the strategic background of carbon peaking and carbon neutrality, gasoline, as a high-carbon-emission fuel, faces the challenge of CO2 emission reduction. Based on gasoline molecular composition data obtained by gas chromatography and gasoline CO2 emission data obtained over the New European Driving Cycle, the gasoline components were categorized according to PONA composition, number of carbon atoms, and number of substituents. The molecular composition data were clustered by hierarchical clustering, and the training set and test set were divided according to the clustering result. A prior model relating gasoline composition to CO2 emission per kilometer was then established, aiming to provide data support for the production of low-carbon-emission gasoline. Because the data samples span a small and concentrated range, the prior model has poor applicability in predicting CO2 emissions. Therefore, a multi-distribution mega-trend-diffusion technique based on radius nearest neighbor classification (RNC-MD-MTD) was proposed, and virtual samples were generated by this method. The results showed that the prediction accuracy of the model improved effectively as virtual samples generated by RNC-MD-MTD were added, which demonstrates the validity of the method. The final prediction model for CO2 emission per kilometer had a coefficient of determination of 0.98, a mean absolute percentage error of 0.29%, and a root-mean-square error of 792.6 mg/km.

Key words: gasoline components, carbon dioxide emission, virtual sample, radius neighbor classifier
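
The abstract reports three accuracy measures for the final model: coefficient of determination, mean absolute percentage error, and root-mean-square error. As a reminder of how these are conventionally computed, here is a minimal sketch using the standard textbook definitions. The function names and the example emission values are illustrative assumptions, not data from the paper.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mape_percent(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


def rmse(y_true, y_pred):
    """Root-mean-square error, in the same unit as the inputs."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


if __name__ == "__main__":
    # Made-up CO2 emissions in mg/km, for illustration only.
    measured = [200000.0, 210000.0, 190000.0]
    predicted = [201000.0, 209000.0, 190500.0]
    print(r_squared(measured, predicted))
    print(mape_percent(measured, predicted))
    print(rmse(measured, predicted))
```

Because RMSE carries the unit of the target while MAPE is relative, the two are best read together: an RMSE in mg/km only becomes meaningful against the typical magnitude of the measured emissions.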