
dc.contributor.author | Van der Schraelen, Lennert
dc.contributor.author | Stouthuysen, Kristof
dc.contributor.author | Vanden Broucke, Seppe
dc.contributor.author | Verdonck, Tim
dc.date.accessioned | 2023-04-12T07:53:17Z
dc.date.available | 2023-04-12T07:53:17Z
dc.date.issued | 2023 | en_US
dc.identifier.issn | 0020-0255
dc.identifier.doi | 10.1016/j.ins.2023.03.146
dc.identifier.uri | http://hdl.handle.net/20.500.12127/7237
dc.description.abstract | In numerous binary classification tasks, the two groups of instances are not equally represented, which often implies that the training data lack sufficient information to model the minority class correctly. Furthermore, many traditional classification models make arbitrarily overconfident predictions outside the range of the training data. These issues severely impact the deployment and usefulness of these models in real life. In this paper, we propose the boundary regularizing out-of-distribution (BROOD) sampler, which adds artificial data points on the edge of the training data. By exploiting these artificial samples, we are able to regularize the decision surface of discriminative machine learning models and make more prudent predictions. Next, it is crucial to correctly classify many positive instances in a limited pool of instances that can be investigated with the available resources. By smartly assigning predetermined nonuniform class probabilities outside the training data, we can emphasize certain data regions and improve classifier performance on various material classification metrics. The good performance of the proposed methodology is illustrated in a case study that consists of both benchmark balanced and imbalanced classification data sets. | en_US
dc.language.iso | en | en_US
dc.publisher | Elsevier Science Inc. | en_US
dc.subject | Binary Classification | en_US
dc.subject | Regularization | en_US
dc.subject | Sampling | en_US
dc.subject | Data Imbalance | en_US
dc.title | Regularization oversampling for classification tasks: To exploit what you do not know | en_US
dc.identifier.journal | Information Sciences | en_US
dc.source.volume | 635 | en_US
dc.source.issue | July | en_US
dc.source.beginpage | 169 | en_US
dc.source.endpage | 194 | en_US
dc.contributor.department | Department of Accounting, Finance and Insurance, Faculty of Economics and Business, KU Leuven, Naamsestraat 69, 3000 Leuven, Belgium | en_US
dc.contributor.department | Department of Business Informatics and Operations Management, Faculty of Economics and Business Administration, UGhent, Tweekerkenstraat 2, 9000 Ghent, Belgium | en_US
dc.contributor.department | Research Center for Information Systems Engineering (LIRIS), Faculty of Economics and Business, KU Leuven, Naamsestraat 69, 3000 Leuven, Belgium | en_US
dc.contributor.department | Department of Mathematics, Faculty of Science, UAntwerpen, Middelheimlaan 1, 2020 Antwerp, Belgium | en_US
dc.contributor.department | Department of Mathematics, Faculty of Science, KU Leuven, Celestijnenlaan 200B, 3001 Leuven, Belgium | en_US
dc.identifier.eissn | 1872-6291
vlerick.knowledgedomain | Accounting & Finance | en_US
vlerick.typearticle | Journal article with impact factor | en_US
vlerick.vlerickdepartment | AF | en_US
dc.identifier.vperid | 286357 | en_US
dc.identifier.vperid | 119751 | en_US

