Optical character recognition (OCR) converts images of text, handwritten or printed, into machine-readable text. Going one step further and interpreting that text in a way similar to how a human reader would has many real-world applications, address recognition being one of them.
PostNord Retail is an app used by service points that handle PostNord processes such as delivering and receiving parcels. To handle parcels without address information, PostNord employees use the app to enter this information manually for each parcel. This process becomes tedious and time-consuming, especially when there is a large volume of parcels to handle. Currently, the app makes use of OCR to parse the following address information:
In this thesis, you will combine text detection with natural language processing (NLP) to detect, process and parse address information. Your goal is to develop a robust solution that extracts address information, written in different address formats, from a printed document or a handwritten note. You could use machine learning techniques, design your own algorithm, or do both, and then evaluate the results.
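As a rough illustration of the parsing step, the sketch below shows what a simple rule-based baseline might look like once OCR has already produced plain text. Everything in it is an assumption made for illustration: the function name `parse_address`, the field names, and the address layout (recipient name, then street, then a line with a five-digit postal code followed by the city). It is not the method used in the PostNord Retail app, and a real solution would need to cope with many more formats and with OCR errors.

```python
import re
from typing import Optional, TypedDict


class Address(TypedDict):
    """Hypothetical output structure; the field names are illustrative only."""
    name: Optional[str]
    street: Optional[str]
    postal_code: Optional[str]
    city: Optional[str]


# Assumed layout: a five-digit postal code, optionally written as "123 45"
# and optionally prefixed with "SE-", followed by the city name.
POSTAL_LINE = re.compile(r"^(?:SE-?\s*)?(\d{3})\s?(\d{2})\s+(.+)$", re.IGNORECASE)


def parse_address(ocr_text: str) -> Address:
    """Extract address fields from OCR output using simple line-based rules."""
    # Collapse repeated whitespace and drop empty lines.
    lines = [" ".join(line.split()) for line in ocr_text.splitlines()]
    lines = [line for line in lines if line]

    result: Address = {"name": None, "street": None, "postal_code": None, "city": None}

    for i, line in enumerate(lines):
        match = POSTAL_LINE.match(line)
        if match:
            result["postal_code"] = match.group(1) + match.group(2)
            result["city"] = match.group(3)
            # Assumption: the street is on the line directly above the
            # postal code line, and the recipient name is above the street.
            if i >= 1:
                result["street"] = lines[i - 1]
            if i >= 2:
                result["name"] = lines[i - 2]
            break

    return result


if __name__ == "__main__":
    sample = "Anna Svensson\nStorgatan 12\n123 45 Stockholm"
    print(parse_address(sample))
```

A baseline like this is mainly useful as a reference point: a learned model or a more flexible algorithm can be evaluated against it on the same set of scanned addresses.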