Lightweight DMS/OCR with VB.NET and Microsoft OneNote

Abstract

This article describes how to build a lightweight DMS (document management system) / OCR (optical character recognition) solution using a VB.NET Windows service and Microsoft OneNote.

Motivation

As of 2016, the paperless office is still an illusion. We still face a lot of paper documents in our everyday business. We have to archive and digitize these documents, and not only for bookkeeping, and we have to make sure we can retrieve them later.

We have had very good experiences with Microsoft OneNote. The tool is well suited to documenting and sharing almost anything. In our opinion, OneNote is capable of at least touching on topics such as knowledge management, document management, digitizing and optical character recognition.

Process

Results

When paper documents are scanned and sent to OneNote, they are automatically placed on new pages, each denoted with a timestamp. This also means that documents should be scanned as soon as possible after they are received. If the scan quality is good enough, OneNote automatically processes the added images and, where possible, builds OCR content ("Alt text"). If the quality of the recognized text is poor, tags and keywords can be added manually to each page (or each image) so that the documents can still be retrieved later.

OneNote notebooks are well suited to team collaboration. Everyone using the server notebook can see the latest changes, including those made by other users. Results can be shared via SharePoint Online or OneDrive, even with users outside the company.

Conclusion

The solution is a low-cost, lightweight process that is easy to establish.
A Microsoft Windows Server and a Microsoft Office (2010 or higher) license are needed.

If you like this solution or are interested in the source code, please contact us.


Dieter Neumann