OMR Scanner Using OpenCV Python

Introduction

Optical Mark Recognition (OMR) recognizes certain “marks” on an image and uses them as points of reference to extract other regions of interest (ROIs) on the page. Software-based OMR is comparatively new, and there is close to no documentation on the subject. Current OMR technologies like ScanTron require custom machines designed specifically to scan custom sheets of paper. An OMR algorithm first needs a template page so that it knows where the ROIs are in relation to the markers. It must then be able to scan a page and recognize where the markers are. Using the template, the algorithm can then determine where the ROIs are in relation to the markers. The markers are the black lines on the sides, and the ROIs are the bubbles that are checked, as shown below.
      
Image 1
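
One possible way to express the template idea in code is sketched below. It assumes two markers whose positions are known on the template and have been detected on the scanned page, and it maps an ROI position from template coordinates to page coordinates. The function name locate_roi and the sample coordinates are hypothetical and are not taken from the project code.

```python
def locate_roi(template_markers, page_markers, template_roi):
    """Map an ROI centre from template coordinates to page coordinates.

    Each marker argument is a pair of (x, y) points. The two pairs define
    a similarity transform (shift, rotation and uniform scale), which is
    then applied to the ROI. Points are handled as complex numbers x + yj
    so that one complex multiplication does the rotation and the scaling.
    """
    t0, t1 = (complex(*p) for p in template_markers)
    p0, p1 = (complex(*p) for p in page_markers)
    if t0 == t1:
        raise ValueError("template markers must be two distinct points")
    transform = (p1 - p0) / (t1 - t0)
    mapped = p0 + transform * (complex(*template_roi) - t0)
    return (mapped.real, mapped.imag)


# Hypothetical example: the page is the template shifted by (10, 20)
# and scanned at twice the resolution.
template_markers = ((0, 0), (0, 100))
page_markers = ((10, 20), (10, 220))
bubble_on_page = locate_roi(template_markers, page_markers, (50, 50))
```

Because the transform is derived from the markers alone, the same template can be reused for every scanned sheet of that format.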

The further apart the markers are, the higher the accuracy we can achieve. A scanned image is not always perfectly aligned; it can be rotated by some angle. We can find that rotation from the markers by measuring the angle of the line between the top-right and bottom-left corners of the markers and comparing it with the same angle on the template. If the scanned markers are at a different angle, we rotate the whole page by the difference, which fixes the skew.
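
The angle comparison can be sketched in a few lines of plain Python. This is a minimal illustration, assuming each marker has already been reduced to a single (x, y) reference point; the function names and coordinates are hypothetical.

```python
import math


def marker_angle(top, bottom):
    """Angle in degrees of the line from the top point to the bottom point."""
    return math.degrees(math.atan2(bottom[1] - top[1], bottom[0] - top[0]))


def skew_angle(template_top, template_bottom, page_top, page_bottom):
    """Difference between the page's marker angle and the template's.

    The result is normalised to the range [-180, 180) degrees.
    """
    diff = marker_angle(page_top, page_bottom) - marker_angle(
        template_top, template_bottom
    )
    return (diff + 180.0) % 360.0 - 180.0


# Hypothetical coordinates: on the template the two points are vertically
# aligned, while on the scanned page the line between them is tilted.
skew = skew_angle((0, 0), (0, 100), (0, 0), (100, 100))
```

The same arithmetic shows why widely spaced markers help: a one-pixel error in a marker position changes the angle by about 0.57 degrees when the markers are 100 pixels apart, but only by about 0.06 degrees when they are 1000 pixels apart.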

Aim:

Whenever an academic or competitive exam is conducted on OMR sheets, an OMR scanner is needed to scan the answer sheet or form and detect the presence or absence of a mark on the specially designed form. We therefore propose a software system that checks OMR sheets of a particular format and stores the data in a database so that it can be used whenever required.

Objectives:

The objectives of the project are as follows:
Take a batch of scanned images at once
Preprocess each image
Read the responses marked on the OMR sheet
Store the information in the database
Display the data on screen alongside the image being processed, so that an incorrectly scanned image can be skipped
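
The first and last objectives can be sketched with the standard library alone. The snippet below is an illustration under assumed names: the folder layout, the list of image extensions, and the functions collect_scans and file_scan are hypothetical and not part of the original project.

```python
import shutil
from pathlib import Path

# Extensions treated as scanned images (an assumption for this sketch).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}


def collect_scans(folder):
    """Return every image file directly inside `folder`, sorted by name."""
    return sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


def file_scan(path, accepted, checked_dir, skipped_dir):
    """Move a processed scan into the 'checked' or the 'skipped' folder.

    `accepted` is the decision made by the user after looking at the image.
    The new location of the file is returned.
    """
    target_dir = Path(checked_dir if accepted else skipped_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(str(path), str(target_dir / Path(path).name)))
```

Keeping skipped sheets in their own folder makes it easy to rescan them later without touching the sheets that were already recorded.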

Requirements:

System Requirements:
Operating System: Windows or Linux
Python 3.7: Python is an interpreted, high-level programming language.
OpenCV: OpenCV (Open Source Computer Vision) is a library of programming functions mainly aimed at real-time computer vision.
MySQL: MySQL is an open-source relational database management system (RDBMS).
Qt Designer: Qt Designer is the Qt tool for designing and building graphical user interfaces (GUIs) with Qt Widgets.

Hardware Requirements:

Minimum Requirements: Windows or Linux computer with 4 GB RAM and an Intel Core i3 processor.
Recommended Requirements: Windows or Linux computer with 8 GB RAM and an Intel Core i7 processor.

Literature Survey:

Existing Systems:

Current OMR technologies like ScanTron require custom machines designed specifically to scan custom sheets of paper. These methods work well, but the machines and the paper are costly and inflexible. If only an ordinary scanner is used, images may come out rotated by some angle, so extra hardware is used to prevent that. Different sheet formats need different machines.

Disadvantages:

Existing hardware solutions are expensive. The initial purchase price is high, and so are the maintenance charges, because the hardware requires periodic maintenance after a certain time and amount of usage. Paper cost is high because the hardware is compatible only with a specific set of sheets, and there are other export expenses. Downtime is at least a day and has no upper limit when spare parts are unavailable. A dedicated OMR scanner unit is required to process the corresponding set of sheets. Forms for hardware OMR readers are not as easy to customize as software forms.

Proposed System:

The OMR reader software works with any document scanner to process a variety of multiple-choice answer sheets. It lets us keep a digitized copy of our documents. The software forms do not require exact alignment: rotated sheets are straightened automatically so that the responses can be read clearly. The software also makes it easy to extract data, which can be exported to different file formats such as XML, CSV, or Excel, or to any compatible database management system.

System Architecture

This system mainly contains three components:

Image Set: Holds three separate groups of images: those that still need to be checked, those already checked, and erroneous ones.
UI: Provides the user interface, where a person can see the image currently being processed and decide whether to check it (record the responses) or skip it for now so that the sheet can be scanned again and checked later. After an image is checked, its responses are written to the database and shown to the user at the same time.
Database System: Records the responses of all the OMR sheets. The data can later be exported to CSV format easily.
Fig. System Architecture 
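
The database component and the CSV export can be sketched as follows. The project lists MySQL as its database; this sketch swaps in Python's built-in sqlite3 module purely so that it runs without a database server. The table name, column names, and function names are hypothetical.

```python
import csv
import sqlite3


def create_store(conn):
    """Create the responses table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS responses ("
        "sheet_id TEXT, question INTEGER, answer TEXT, "
        "PRIMARY KEY (sheet_id, question))"
    )


def save_sheet(conn, sheet_id, answers):
    """Store one sheet; `answers` maps question number to option or None."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO responses VALUES (?, ?, ?)",
            [(sheet_id, question, answer) for question, answer in answers.items()],
        )


def export_csv(conn, out_file):
    """Write every stored response to an open text file in CSV format."""
    writer = csv.writer(out_file)
    writer.writerow(["sheet_id", "question", "answer"])
    writer.writerows(
        conn.execute(
            "SELECT sheet_id, question, answer FROM responses "
            "ORDER BY sheet_id, question"
        )
    )
```

An unanswered question is stored as NULL and appears as an empty field in the CSV output. With a MySQL driver the overall shape would be similar, although the connection setup and the placeholder syntax would differ.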


Implementation:

Algorithm:

Pre-process the image to make it black and white. Gaussian blurring is applied for this, and erosion and dilation are used to remove noise from the image near the black marks.
Crop the part of the image that contains the black marks.
Find the contours and note the centroids of two black marks, one from the top and one from the bottom, and then find the angle of rotation.
   
Image 4.1 Blurred

Image 4.2 After Dilation and Erosion
Rotate the original image and pre-process it for reading the responses filled by the candidate. 
Image 4.3 Image after processing
Store the result in the database.

Software Testing:

Software testing is an activity that checks whether the actual results match the expected results and ensures that the software system is defect-free. It involves executing a software component or system component to evaluate one or more properties of interest.
Software testing also helps to identify errors, gaps, or missing requirements relative to the actual requirements. It can be done either manually or with automated tools. Some prefer to describe software testing in terms of white-box and black-box testing.

Unit Testing:

Unit testing is a level of software testing where individual units or components of the software are tested. The purpose is to validate that each unit of the software performs as designed. A unit is the smallest testable part of any software. It usually has one or a few inputs and usually a single output. In this project, we tested each module separately and fixed the bugs we found.
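
As an illustration of what a unit test for one small module might look like, the sketch below tests a hypothetical helper, pick_answer, that turns the fill ratios of one question's bubbles into a single answer. Both the helper and the test cases are assumptions and are not taken from the project's test suite.

```python
import unittest


def pick_answer(fill_ratios, threshold=0.5):
    """Return the single option whose bubble is filled, otherwise None.

    `fill_ratios` maps an option label to the fraction of dark pixels in
    its bubble. Blank and multiply-marked questions both yield None.
    """
    marked = [
        option for option, ratio in fill_ratios.items() if ratio >= threshold
    ]
    return marked[0] if len(marked) == 1 else None


class PickAnswerTest(unittest.TestCase):
    def test_single_mark(self):
        self.assertEqual(pick_answer({"A": 0.05, "B": 0.80, "C": 0.10}), "B")

    def test_blank_question(self):
        self.assertIsNone(pick_answer({"A": 0.05, "B": 0.02}))

    def test_multiple_marks(self):
        self.assertIsNone(pick_answer({"A": 0.90, "B": 0.70}))
```

Saved in a file, these tests can be run with python -m unittest, which discovers the test case class automatically.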

Integration Testing:

Integration testing is a level of software testing where individual units are combined and tested as a group. The purpose of this level of testing is to expose faults in the interaction between integrated units. Test drivers and test stubs are used to assist in integration testing. In this project, we combined the individual modules, tested them as a group, and fixed the bugs we found.

System Testing:

System testing is a type of software testing performed on a complete, integrated system to evaluate its compliance with the corresponding requirements. Components that have passed integration testing are taken as input. After the system was fully integrated, we performed system testing and fixed the bugs we found.

Beta/Acceptance Testing:

A beta test is a testing period for a computer product prior to any sort of commercial or official release. Beta testing is considered the last stage of testing and normally involves distributing the product to beta test sites and individual users ("beta testers") outside the company for real-world exposure. Before submitting this project, we performed thorough beta/acceptance testing by giving it to our classmates and fixed the bugs they found.

Screenshots:

Conclusion and Future Scope:

OMR scanners used to be hardware-centered, but here we implemented one in software, and it requires only a set of scanned images. Scanned images rotated by some angle are also accepted, so hardware for proper alignment is not required. A user interface is provided so that people with little technical knowledge can use the system. The OMR sheet is shown to the user, so an image that was scanned badly by mistake can be skipped and scanned again. Candidates sometimes fill in their details incorrectly or forget to mark required details; such cases are also flagged in the database.

In the future, we can add user authentication so that we have a record of who scanned which images; if something improper happens, we can reach out to the user concerned. Authentication would also let us restrict access to users who are trained in this software.
