Petroleum Processing and Petrochemicals (石油炼制与化工) ›› 2026, Vol. 58 ›› Issue (7): 89-98.

• Control and Optimization •

Research on a Coking Prediction Model for Residual Oil Based on Data Augmentation and Multi-Model Comparison

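Wang Xueshen (王学深) [1], Liu Yang (刘洋) [1,2], Zhong Mei (钟梅) [1], Fan Shiguang (范士广) [3], Yu Yin (俞音) [1,4], Dai Zhenghua (代正华) [1,5], Jin Lijun (靳立军) [1,6], Liu Yang (刘洋) [1], Yusan Tulafu (玉散·吐拉甫) [1], Jiang Jun (江军) [1,7]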

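  1. College of Chemical Engineering, Xinjiang University
  2. Research Institute of PetroChina Dushanzi Petrochemical Company
  3. China University of Petroleum (Beijing) at Karamay
  4. Xinjiang Academy of Ecological and Environmental Sciences
  5. East China University of Science and Technology
  6. Dalian University of Technology
  7. Xinjiang Zhongtai Mining and Metallurgy Co., Ltd.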
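  • Received: 2026-01-08  Revised: 2026-02-28  Online: 2026-07-12  Published: 2026-06-29
  • Corresponding author: Wang Xueshen (王学深)  E-mail: 1873859725@qq.com
  • Funding:
    Key Research and Development Program of Xinjiang Uygur Autonomous Region; Basic Scientific Research Funds for Universities of the Autonomous Region; "Tianshan Talents" Young Top Talent Program; Natural Science Foundation of Xinjiang Uygur Autonomous Region; National Natural Science Foundation of China; "Tianchi Talents" Young Doctoral Talent Program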

Abstract: To enable rapid assessment of residual oil coking risk, a prediction model for residual oil coking characteristics was developed by integrating molecular structural parameters, physicochemical properties, data augmentation, and multi-model comparison. First, the molecular structural parameters of residual oil were calculated by the Brown–Ladner method from elemental composition, carbon residue, SARA composition, and hydrogen-type distribution data, yielding a dataset for predicting coking characteristics; key feature variables were then selected by Spearman correlation analysis. Next, the training-set samples were augmented with a generative adversarial network, and the augmentation factor was chosen with the Kolmogorov–Smirnov test. On this basis, back-propagation (BP) neural network, Gaussian kernel regression (GKR), and random forest regression (RF) models were built to predict coking characteristics, and their performance was compared. The results show that the feature parameters most strongly correlated with the coking yield are the aromatic carbon ratio, total ring number, density, carbon residue, and resin content, and that the optimal data augmentation factor is 3. Of the three models, GKR and RF showed poorer prediction accuracy and generalization, whereas the BP neural network was the most accurate, with a mean absolute error of 0.1235, a root mean square error of 0.1482, and a coefficient of determination of 0.8964 on the test set. The BP model also gave the smallest cross-validation error, confirming that it offers the best stability and generalization for residual oil coking prediction.

Key words: residual oil, coking characteristics, prediction model, structural parameter, data augmentation, BP neural network
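
Illustrative sketch: the abstract names three statistical tools, namely Spearman correlation for feature screening, the two-sample Kolmogorov–Smirnov statistic for comparing augmented and original samples, and MAE/RMSE/R² for scoring the regression models. The plain-Python sketch below shows one way these quantities can be computed. It is not the authors' implementation: every function name is hypothetical, the toy numbers are invented for illustration, and the paper's actual thresholds, GAN architecture, and model settings are not given in the abstract.

```python
import bisect
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a_sorted, b_sorted = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for v in a_sorted + b_sorted:
        cdf_a = bisect.bisect_right(a_sorted, v) / len(a_sorted)
        cdf_b = bisect.bisect_right(b_sorted, v) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for predictions against measured values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - sum(e * e for e in errors) / ss_tot
    return mae, rmse, r2


if __name__ == "__main__":
    # Invented toy values, NOT data from the paper.
    carbon_residue = [12.1, 15.4, 18.0, 20.3, 23.7]
    coking_yield = [1.0, 1.3, 1.2, 1.8, 2.1]
    print("Spearman rho:", spearman(carbon_residue, coking_yield))
    print("KS statistic:", ks_statistic(coking_yield, [1.1, 1.25, 1.7, 2.0]))
    print("MAE, RMSE, R2:", regression_metrics(coking_yield, [1.1, 1.2, 1.3, 1.7, 2.0]))
```

In a workflow like the one the abstract describes, a feature would be kept when the absolute value of its Spearman coefficient with the coking yield is large, and an augmentation factor would be preferred when the KS statistic between synthetic and original samples stays small; the cutoffs themselves are a modelling choice that the abstract does not state.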