
Segmentation and estimation of claim severity in motor third-party liability insurance through contrast analysis

Abstract

Research background: Using marginal means and contrast analysis of the target variable, e.g., claim severity (CS), the actuary can perform an in-depth analysis of the portfolio and fully use the potential of the general linear model. These analyses are mainly used in the natural sciences, medicine, and psychology, but so far they have not been given adequate attention in the actuarial field.

Purpose of the article: The article's primary purpose is to point out the possibilities of contrast analysis for the segmentation of policyholders and the estimation of CS in motor third-party liability insurance. The article focuses on using contrast analysis to redefine the individual relevant factors so that the segmentation of policyholders is actuarially fair and statistically correct. The article also aims to show how contrast analysis can be used for adequate segmentation when factors interact, and for the subsequent estimation of CS.

Methods: The article uses the general linear model and the associated least squares means. Contrast analysis is implemented by testing and estimating linear combinations of model parameters. Equations of estimable functions reveal how to interpret the results correctly.
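
As a minimal sketch of what estimating a linear combination of model parameters involves, the Python example below computes one contrast in a one-way general linear model, where the least squares means coincide with the group means. The data, the three age categories, the contrast coefficients, and the helper name `contrast_estimate` are hypothetical illustrations and are not taken from the article.

```python
import math

# Hypothetical claim severities (illustrative numbers, not from the article),
# grouped by an assumed three-level policyholder age factor.
claims = {
    "young": [1200.0, 1500.0, 1800.0],
    "middle": [900.0, 1000.0, 1100.0],
    "senior": [800.0, 1000.0, 1200.0],
}


def contrast_estimate(groups, coeffs):
    """Estimate sum(c_i * mu_i), its standard error and t statistic
    in a one-way model with a common residual variance."""
    means = {g: sum(v) / len(v) for g, v in groups.items()}
    n_total = sum(len(v) for v in groups.values())
    # Residual sum of squares and mean squared error of the one-way model.
    sse = sum((y - means[g]) ** 2 for g, v in groups.items() for y in v)
    mse = sse / (n_total - len(groups))
    # Point estimate of the contrast: a linear combination of group means.
    estimate = sum(coeffs[g] * means[g] for g in groups)
    # Standard error: sqrt(MSE * sum(c_i^2 / n_i)).
    se = math.sqrt(mse * sum(coeffs[g] ** 2 / len(groups[g]) for g in groups))
    return estimate, se, estimate / se


# Contrast: youngest group versus the average of the other two groups.
# The coefficients sum to zero, as required for a contrast.
coeffs = {"young": 1.0, "middle": -0.5, "senior": -0.5}
estimate, se, t_stat = contrast_estimate(claims, coeffs)
print(estimate, se, t_stat)
```

The t statistic would be compared with a t distribution on the residual degrees of freedom (here 6). If such a contrast between two categories is not significant, that may suggest merging them into one segment, which is the kind of decision the contrast analysis is meant to support.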

Findings & value added: The article shows that contrast analysis is a valuable tool for segmenting policyholders in motor insurance. The validity of the segmentation is statistically verifiable, and the approach is well suited to the main effects. If cross (interaction) effects prove significant during segmentation, the actuary must take into account the risk that, even if the individual segmentation factors are set adequately and verified statistically, this may not hold for the interaction of these factors. The article also provides a procedure for segmentation when factors interact and a procedure for estimating the CS of a segment. Empirical research has shown that CS is significantly influenced by the weight, engine power, age, and brand of the car, the policyholder's age, and the district. The pattern of the influence of age on CS differs across categories of car brands. The significantly highest CS was found in the youngest age category and in the category of luxury car brands.
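
The following sketch illustrates, under stated assumptions, why an interaction matters for segment estimates. In a two-factor model that includes the interaction, the fitted cell values equal the cell means, and the least squares mean of a factor level is the equally weighted average of its cell means, which generally differs from the raw mean in unbalanced data. The cell data, the two age and two brand categories, and all variable names are hypothetical and do not reproduce the article's data or results.

```python
# Hypothetical claim severities per (age category, brand category) cell.
cells = {
    ("young", "luxury"): [2400.0, 2600.0],
    ("young", "standard"): [1400.0, 1500.0, 1600.0],
    ("older", "luxury"): [1500.0, 1700.0],
    ("older", "standard"): [1000.0, 1100.0, 1200.0, 1100.0],
}

ages = sorted({age for age, _ in cells})
brands = sorted({brand for _, brand in cells})

# Cell means: the fitted values of a model with both factors and their interaction.
cell_mean = {key: sum(values) / len(values) for key, values in cells.items()}

# Least squares mean of an age level: equally weighted average over brand cells.
ls_mean_age = {
    age: sum(cell_mean[(age, brand)] for brand in brands) / len(brands)
    for age in ages
}

# Raw (observation-weighted) mean of an age level, for comparison.
raw_mean_age = {
    age: sum(sum(cells[(age, brand)]) for brand in brands)
    / sum(len(cells[(age, brand)]) for brand in brands)
    for age in ages
}

# Interaction contrast: the age effect within luxury brands minus
# the age effect within standard brands (a difference of differences).
interaction = (
    (cell_mean[("young", "luxury")] - cell_mean[("older", "luxury")])
    - (cell_mean[("young", "standard")] - cell_mean[("older", "standard")])
)

print(ls_mean_age, raw_mean_age, interaction)
```

A nonzero interaction contrast means the age effect is not the same in every brand category, so a segment's CS is better read from the cell-level estimate than from the two marginal estimates. The sketch gives point estimates only; testing whether the interaction is significant would also need the residual variance.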

Keywords

general linear model, claim severity, motor third party liability insurance, least squares means, contrast analysis


