Improving probabilistic record linkage with a single-layer neural network

dc.contributor.author: Hamersma, Kris A.
dc.date.accessioned: 2019-02-04T13:10:23Z
dc.date.available: 2019-02-04T13:10:23Z
dc.date.created: 2017
dc.date.issued: 2017
dc.description: Mini Dissertation (B Eng. (Industrial and Systems Engineering))--University of Pretoria, 2017. [en_ZA]
dc.description.abstract: Data analysis requires data to be of a high quality. Unfortunately this is not always the case, especially when data is extracted from different data sources. Where there is no unique identifier to match data records from multiple data sources, alternative methods need to be developed to match the records. Record linkage attempts to do this primarily with deterministic and probabilistic approaches. Deterministic models require certain corresponding fields in a record pair to be identical before the pair is matched. Probabilistic methods use a set of equations called the Fellegi-Sunter formulae to calculate decision-making weights, which are used to score a record pair on how well the two records match. If the matching score is above a certain threshold, the record pair is considered to be a match. This project investigates whether the development of a learning algorithm that refines the weights will improve the probabilistic model's matching accuracy. The dataset that was used to train and test the record linkage models was a set of 92650 record pairs, some of which were matches and some of which were non-matches. It was found that a learning algorithm did improve the matching accuracy of the probabilistic model, although it is likely that increasing the number of input features would improve the matching performance even more. [en_ZA]
dc.format.medium: PDF [en_ZA]
dc.identifier.uri: http://hdl.handle.net/2263/68389
dc.language: en
dc.language.iso: en [en_ZA]
dc.publisher: University of Pretoria. Faculty of Engineering, Built Environment and Information Technology. Dept. of Industrial and Systems Engineering [en_ZA]
dc.rights: © 2017 University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria. [en_ZA]
dc.subject: Mini-dissertations (Industrial and Systems Engineering) [en_ZA]
dc.title: Improving probabilistic record linkage with a single-layer neural network [en_ZA]
dc.type: Mini Dissertation [en_ZA]
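
The abstract describes scoring record pairs with Fellegi-Sunter weights, comparing the score with a threshold, and refining the weights with a learning algorithm, but this record does not give the formulae or the network details. The sketch below is therefore only an illustration under stated assumptions: it uses the standard log-likelihood-ratio form of the Fellegi-Sunter weights, the field names and m/u probabilities are hypothetical, and the refinement step is a single logistic unit trained by gradient descent, which is one possible single-layer learner and not necessarily the one used in the dissertation.

```python
import math

# Hypothetical per-field probabilities (illustrative values, not from the source):
#   m = P(field agrees | the two records are a true match)
#   u = P(field agrees | the two records are not a match)
M_U = {
    "surname": (0.95, 0.01),
    "birth_year": (0.90, 0.05),
    "postcode": (0.85, 0.10),
}


def field_weights(m, u):
    """Return (agreement weight, disagreement weight) as log2 likelihood ratios."""
    return math.log2(m / u), math.log2((1 - m) / (1 - u))


def match_score(agreement):
    """Sum the per-field weights for one record pair.

    `agreement` maps each field name to True if the two records agree on it.
    """
    score = 0.0
    for field, (m, u) in M_U.items():
        w_agree, w_disagree = field_weights(m, u)
        score += w_agree if agreement[field] else w_disagree
    return score


def is_match(agreement, threshold=0.0):
    """Classify a pair as a match when its score exceeds the threshold."""
    return match_score(agreement) > threshold


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def refine(weights, bias, pairs, labels, rate=0.1, epochs=100):
    """Refine per-field weights with a single logistic unit (assumed learner).

    `pairs` holds 0/1 agreement vectors; `labels` holds 1 (match) or 0 (non-match).
    """
    weights = list(weights)
    for _ in range(epochs):
        for x, y in zip(pairs, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = y - p  # gradient of the log-likelihood w.r.t. the pre-activation
            weights = [w + rate * err * xi for w, xi in zip(weights, x)]
            bias += rate * err
    return weights, bias
```

In this sketch a pair that agrees on every field receives a positive score and a pair that disagrees on every field receives a negative one, so a threshold of zero separates the two extremes. The `refine` function could be started from the Fellegi-Sunter agreement weights instead of zeros, which is one way to read "refining" the probabilistic weights.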

Files

Original bundle

Name: Hamersma_Improving_2017.pdf
Size: 1.63 MB
Format: Adobe Portable Document Format
Description: Mini Dissertation

License bundle

Name: license.txt
Size: 1.75 KB
Format: Item-specific license agreed upon to submission