Evaluation of Machine Learning Models for Car Price Prediction: A Comparison of Ensemble Methods and Linear Regression

Authors

  • Nur Oktavin Idris Universitas Negeri Gorontalo
  • Fuad Pontoiyo Universitas Negeri Gorontalo

DOI:

https://doi.org/10.70340/jirsi.v4i1.181

Keywords:

prediction, machine learning, gradient boosting, random forest, linear regression

Abstract

Car price prediction is a major challenge in the automotive industry because prices depend on many factors, such as technical specifications, fuel type, and transmission system. This research evaluates and compares the performance of linear regression against two ensemble learning methods, Random Forest and Gradient Boosting, in predicting car prices. The dataset comes from Kaggle and contains 11,914 rows and 16 features. The research process covers data understanding, data preparation, modeling, and evaluation using the Mean Squared Error (MSE) and R-squared (R²) metrics. The results show that Gradient Boosting performs best, with an R² of 0.963868 and the lowest MSE of the three models, followed by Random Forest with an R² of 0.899657. Linear regression performs worst, with an R² of 0.417905, which indicates its limitations in handling non-linear relationships in the data. Predictions from the best model are quite close to actual prices, although further improvement through hyperparameter optimization is still needed. This research confirms that ensemble learning methods, especially Gradient Boosting, are more effective than linear regression for predicting car prices. The model could be applied in the automotive industry to improve the accuracy of vehicle price estimates for manufacturers, dealers, and consumers.
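
For readers unfamiliar with the two evaluation metrics named in the abstract, the following is a minimal sketch of how MSE and R² are commonly computed from actual and predicted values. It is not the authors' code: the function names and the four price values are hypothetical, chosen only for illustration, and the paper's own pipeline may compute these metrics differently (for example through a library).

```python
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of the squared differences between actual and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean price.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all actual values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical prices for illustration only; not taken from the paper's dataset.
actual = [20000.0, 30000.0, 40000.0, 50000.0]
predicted = [22000.0, 29000.0, 41000.0, 48000.0]

print(f"MSE: {mean_squared_error(actual, predicted):.1f}")  # MSE: 2500000.0
print(f"R²:  {r_squared(actual, predicted):.4f}")           # R²:  0.9800
```

Under this definition, an R² near 1 means the model explains most of the variance in price, while a lower value such as the 0.417905 reported for linear regression means much of the variance is left unexplained.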

Published

2025-01-31