{"id":14183,"date":"2012-07-24T00:00:00","date_gmt":"2012-07-24T00:00:00","guid":{"rendered":"https:\/\/www.dmcinfo.com\/our-work\/sharepoint-optical-character-recognition-ocr-solution-for-image-only-pdfs\/"},"modified":"2025-07-07T22:06:52","modified_gmt":"2025-07-07T22:06:52","slug":"sharepoint-optical-character-recognition-ocr-solution-for-image-only-pdfs","status":"publish","type":"our_work","link":"https:\/\/www.dmcinfo.com\/our-work\/sharepoint-optical-character-recognition-ocr-solution-for-image-only-pdfs\/","title":{"rendered":"SharePoint Optical Character Recognition (OCR) Solution for Image Only PDFs"},"content":{"rendered":"\n<p>Approximately 60% of the law firm&#8217;s files are PDF files, and 1\/3 of these PDFs are Image Only. The content of PDF files which contain only images cannot be searched.<\/p>\n\n\n\n<p>The legal firm asked DMC for assistance with scanning their existing SharePoint Document Repository&#8217;s 700,000+ files and converting Image Only PDF documents to searchable documents using optical character recognition (OCR).<\/p>\n\n\n\n<p>To help the law firm&#8217;s staff quickly locate key documents, DMC built an application to first scan all existing documents already in SharePoint to determine which were Image Only PDFs. These documents were then processed by an OCR module built upon&nbsp;the&nbsp;<a href=\"http:\/\/www.aquaforest.com\/en\/ocrsdk.asp\" onclick=\"window.open(this.href, '', 'resizable=no,status=no,location=no,toolbar=no,menubar=no,fullscreen=no,scrollbars=no,dependent=no'); return false;\">Aquaforest OCR SDK<\/a>&nbsp;to&nbsp;render the textual content searchable via SharePoint. 
The legal firm&#8217;s SharePoint document repository of 700,000 files was scanned and converted in approximately 45 days, with a 96% success rate of adding a searchable text layer to image-only PDF files.<\/p>\n\n\n\n<p>A simple <a href=\"https:\/\/www.dmcinfo.com\/blog\/29715\/save-time-money-and-frustration-with-sharepoint-2010-search\/\" target=\"_blank\" rel=\"noreferrer noopener\">SharePoint keyword search<\/a> now instantly retrieves a list of all files containing the specified keyword(s), providing quick access to the information in all of the client&#8217;s document files, saving vital time for their employees and customers.<\/p>\n\n\n\n<p>Since implementing the original SharePoint OCR application, DMC has upgraded the application for compatibility with SharePoint 2010, 2013, 2016, and Office 365 SharePoint Online. Features have&nbsp;also been added to identify newly uploaded PDF files and OCR them&nbsp;multiple times daily, as well as the ability to re-scan specific sites and libraries.<\/p>\n\n\n\n<p><strong>For more information on our <a href=\"https:\/\/www.dmcinfo.com\/services\/digital-workplace-solutions\/sharepoint\/ocr-solution\/\" target=\"_blank\" rel=\"noreferrer noopener\">SharePoint OCR Solution<\/a>, please <a href=\"https:\/\/www.dmcinfo.com\/contact\/\">Contact Us<\/a>.<\/strong><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Approximately 60% of the law firm&#8217;s files are PDF files, and 1\/3 of these PDFs are Image Only. The content of PDF files which contain only images cannot be searched. 
The legal firm asked DMC for assistance with scanning their existing SharePoint Document Repository&#8217;s 700,000+ files and converting Image Only PDF documents to searchable documents [&hellip;]<\/p>\n","protected":false},"author":8,"featured_media":14182,"template":"","meta":{"customer":"Daley Mohan Groble","summary":"<p>DMC&#039;s consulting services team implemented our <a href=\"https:\/\/www.dmcinfo.com\/services\/digital-workplace-solutions\/sharepoint\/ocr-solution\/\">SharePoint OCR Solution<\/a> to convert Image Only PDF documents to searchable text for&nbsp;an established law firm based in Chicago, Illinois. The solution automatically scanned every document stored in the SharePoint Document Management System, identified Image Only PDF files, added a text layer to those PDF files via optical character recognition, and automatically re-saved the documents to the SharePoint Document Management System where they could be indexed by SharePoint&#039;s Enterprise Search engine.<\/p>\r\n","description":"","customer_benefits":"<ul>\r\n <li>PDF files can now be indexed by SharePoint Enterprise Search and instantly searched from SharePoint, allowing the legal firm&#039;s staff to quickly locate documents using simple keyword search<\/li>\r\n <li>Automation of the OCR process saved at least 4,000 hours of staff time that would have been required to convert each PDF file individually<\/li>\r\n <li>At least $150,000 was saved by implementing a custom solution when compared to the cost of implementing a packaged OCR software application, which is typically priced at $1+ per OCR&#039;d page<\/li>\r\n <li>Achieved a 96% success rate of adding a searchable text layer to image-only PDF files<\/li>\r\n<\/ul>\r\n","components_used":"<ul>\r\n <li>Microsoft SharePoint 2010<\/li>\r\n <li>Microsoft Office SharePoint Server (MOSS) 2007<\/li>\r\n <li>Microsoft .NET Framework 3.5<\/li>\r\n <li>Microsoft SQL Server 2008 R2<\/li>\r\n <li>Microsoft Windows Server 2008 
R2<\/li>\r\n <li><a href=\"http:\/\/www.aquaforest.com\/en\/ocrsdk.asp\" target=\"_blank\">Aquaforest OCR SDK<\/a><\/li>\r\n<\/ul>\r\n","project":"Daley Mohan Groble SharePoint OCR Project","author":"Rick Rietz","notes":""},"work_category":[708,712],"class_list":["post-14183","our_work","type-our_work","status-publish","has-post-thumbnail","hentry","work_category-digital-workplace-solutions","work_category-sharepoint"],"yoast_head":"<title>SharePoint Optical Character Recognition (OCR) Solution for Image Only PDFs | DMC, Inc.<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.dmcinfo.com\/our-work\/sharepoint-optical-character-recognition-ocr-solution-for-image-only-pdfs\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"SharePoint Optical Character Recognition (OCR) Solution for Image Only PDFs\" \/>\n<meta property=\"og:description\" content=\"Approximately 60% of the law firm&#8217;s files are PDF files, and 1\/3 of these PDFs are Image Only. The content of PDF files which contain only images cannot be searched. 
The legal firm asked DMC for assistance with scanning their existing SharePoint Document Repository&#8217;s 700,000+ files and converting Image Only PDF documents to searchable documents [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.dmcinfo.com\/our-work\/sharepoint-optical-character-recognition-ocr-solution-for-image-only-pdfs\/\" \/>\n<meta property=\"og:site_name\" content=\"DMC, Inc.\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/pages\/DMC-Inc\/107982009242929\" \/>\n<meta property=\"article:modified_time\" content=\"2025-07-07T22:06:52+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/cdn.dmcinfo.com\/wp-content\/uploads\/2025\/05\/27170142\/OCR-Solution-for-Image-Only-PDFs.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1400\" \/>\n\t<meta property=\"og:image:height\" content=\"500\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.dmcinfo.com\/our-work\/sharepoint-optical-character-recognition-ocr-solution-for-image-only-pdfs\/\",\"url\":\"https:\/\/www.dmcinfo.com\/our-work\/sharepoint-optical-character-recognition-ocr-solution-for-image-only-pdfs\/\",\"name\":\"SharePoint Optical Character Recognition (OCR) Solution for Image Only PDFs | DMC, Inc.\",\"isPartOf\":{\"@id\":\"https:\/\/www.dmcinfo.com\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.dmcinfo.com\/our-work\/sharepoint-optical-character-recognition-ocr-solution-for-image-only-pdfs\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.dmcinfo.com\/our-work\/sharepoint-optical-character-recognition-ocr-solution-for-image-only-pdfs\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/cdn.dmcinfo.com\/wp-content\/uploads\/2025\/05\/27170142\/OCR-Solution-for-Image-Only-PDFs.png\",\"datePublished\":\"2012-07-24T00:00:00+00:00\",\"dateModified\":\"2025-07-07T22:06:52+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/www.dmcinfo.com\/our-work\/sharepoint-optical-character-recognition-ocr-solution-for-image-only-pdfs\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.dmcinfo.com\/our-work\/sharepoint-optical-character-recognition-ocr-solution-for-image-only-pdfs\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.dmcinfo.com\/our-work\/sharepoint-optical-character-recognition-ocr-solution-for-image-only-pdfs\/#primaryimage\",\"url\":\"https:\/\/cdn.dmcinfo.com\/wp-content\/uploads\/2025\/05\/27170142\/OCR-Solution-for-Image-Only-PDFs.png\",\"contentUrl\":\"https:\/\/cdn.dmcinfo.com\/wp-content\/uploads\/2025\/05\/27170142\/OCR-Solution-for-Image-Only-PDFs.png\",\"width\":1400,\"height\":500,\"caption\":\"
(OCR) Solution for Image Only PDFs\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.dmcinfo.com\/our-work\/sharepoint-optical-character-recognition-ocr-solution-for-image-only-pdfs\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Work\",\"item\":\"https:\/\/www.dmcinfo.com\/our-work\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"SharePoint Optical Character Recognition (OCR) Solution for Image Only PDFs\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.dmcinfo.com\/#website\",\"url\":\"https:\/\/www.dmcinfo.com\/\",\"name\":\"DMC, Inc.\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/www.dmcinfo.com\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.dmcinfo.com\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.dmcinfo.com\/#organization\",\"name\":\"DMC, Inc.\",\"url\":\"https:\/\/www.dmcinfo.com\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.dmcinfo.com\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/cdn.dmcinfo.com\/wp-content\/uploads\/2025\/05\/27171146\/dmc-logo-1.png\",\"contentUrl\":\"https:\/\/cdn.dmcinfo.com\/wp-content\/uploads\/2025\/05\/27171146\/dmc-logo-1.png\",\"width\":418,\"height\":167,\"caption\":\"DMC, Inc.\"},\"image\":{\"@id\":\"https:\/\/www.dmcinfo.com\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/pages\/DMC-Inc\/107982009242929\",\"https:\/\/www.instagram.com\/dmcengineering\",\"https:\/\/www.youtube.com\/DMCEngineering\",\"https:\/\/www.linkedin.com\/company\/dmc-engineering\"]}]}<\/script>","yoast_head_json":{"title":"SharePoint Optical Character Recognition (OCR) Solution for Image Only PDFs | DMC, 
Inc.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.dmcinfo.com\/our-work\/sharepoint-optical-character-recognition-ocr-solution-for-image-only-pdfs\/","og_locale":"en_US","og_type":"article","og_title":"SharePoint Optical Character Recognition (OCR) Solution for Image Only PDFs","og_description":"Approximately 60% of the law firm&#8217;s files are PDF files, and 1\/3 of these PDFs are Image Only. The content of PDF files which contain only images cannot be searched. The legal firm asked DMC for assistance with scanning their existing SharePoint Document Repository&#8217;s 700,000+ files and converting Image Only PDF documents to searchable documents [&hellip;]","og_url":"https:\/\/www.dmcinfo.com\/our-work\/sharepoint-optical-character-recognition-ocr-solution-for-image-only-pdfs\/","og_site_name":"DMC, Inc.","article_publisher":"https:\/\/www.facebook.com\/pages\/DMC-Inc\/107982009242929","article_modified_time":"2025-07-07T22:06:52+00:00","og_image":[{"width":1400,"height":500,"url":"https:\/\/cdn.dmcinfo.com\/wp-content\/uploads\/2025\/05\/27170142\/OCR-Solution-for-Image-Only-PDFs.png","type":"image\/png"}],"twitter_card":"summary_large_image","twitter_misc":{"Est. 
reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.dmcinfo.com\/our-work\/sharepoint-optical-character-recognition-ocr-solution-for-image-only-pdfs\/","url":"https:\/\/www.dmcinfo.com\/our-work\/sharepoint-optical-character-recognition-ocr-solution-for-image-only-pdfs\/","name":"SharePoint Optical Character Recognition (OCR) Solution for Image Only PDFs | DMC, Inc.","isPartOf":{"@id":"https:\/\/www.dmcinfo.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.dmcinfo.com\/our-work\/sharepoint-optical-character-recognition-ocr-solution-for-image-only-pdfs\/#primaryimage"},"image":{"@id":"https:\/\/www.dmcinfo.com\/our-work\/sharepoint-optical-character-recognition-ocr-solution-for-image-only-pdfs\/#primaryimage"},"thumbnailUrl":"https:\/\/cdn.dmcinfo.com\/wp-content\/uploads\/2025\/05\/27170142\/OCR-Solution-for-Image-Only-PDFs.png","datePublished":"2012-07-24T00:00:00+00:00","dateModified":"2025-07-07T22:06:52+00:00","breadcrumb":{"@id":"https:\/\/www.dmcinfo.com\/our-work\/sharepoint-optical-character-recognition-ocr-solution-for-image-only-pdfs\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.dmcinfo.com\/our-work\/sharepoint-optical-character-recognition-ocr-solution-for-image-only-pdfs\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.dmcinfo.com\/our-work\/sharepoint-optical-character-recognition-ocr-solution-for-image-only-pdfs\/#primaryimage","url":"https:\/\/cdn.dmcinfo.com\/wp-content\/uploads\/2025\/05\/27170142\/OCR-Solution-for-Image-Only-PDFs.png","contentUrl":"https:\/\/cdn.dmcinfo.com\/wp-content\/uploads\/2025\/05\/27170142\/OCR-Solution-for-Image-Only-PDFs.png","width":1400,"height":500,"caption":"(OCR) Solution for Image Only 
PDFs"},{"@type":"BreadcrumbList","@id":"https:\/\/www.dmcinfo.com\/our-work\/sharepoint-optical-character-recognition-ocr-solution-for-image-only-pdfs\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Work","item":"https:\/\/www.dmcinfo.com\/our-work\/"},{"@type":"ListItem","position":2,"name":"SharePoint Optical Character Recognition (OCR) Solution for Image Only PDFs"}]},{"@type":"WebSite","@id":"https:\/\/www.dmcinfo.com\/#website","url":"https:\/\/www.dmcinfo.com\/","name":"DMC, Inc.","description":"","publisher":{"@id":"https:\/\/www.dmcinfo.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.dmcinfo.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.dmcinfo.com\/#organization","name":"DMC, Inc.","url":"https:\/\/www.dmcinfo.com\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.dmcinfo.com\/#\/schema\/logo\/image\/","url":"https:\/\/cdn.dmcinfo.com\/wp-content\/uploads\/2025\/05\/27171146\/dmc-logo-1.png","contentUrl":"https:\/\/cdn.dmcinfo.com\/wp-content\/uploads\/2025\/05\/27171146\/dmc-logo-1.png","width":418,"height":167,"caption":"DMC, 
Inc."},"image":{"@id":"https:\/\/www.dmcinfo.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/pages\/DMC-Inc\/107982009242929","https:\/\/www.instagram.com\/dmcengineering","https:\/\/www.youtube.com\/DMCEngineering","https:\/\/www.linkedin.com\/company\/dmc-engineering"]}]}},"_links":{"self":[{"href":"https:\/\/www.dmcinfo.com\/wp-json\/wp\/v2\/our_work\/14183","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.dmcinfo.com\/wp-json\/wp\/v2\/our_work"}],"about":[{"href":"https:\/\/www.dmcinfo.com\/wp-json\/wp\/v2\/types\/our_work"}],"author":[{"embeddable":true,"href":"https:\/\/www.dmcinfo.com\/wp-json\/wp\/v2\/users\/8"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.dmcinfo.com\/wp-json\/wp\/v2\/media\/14182"}],"wp:attachment":[{"href":"https:\/\/www.dmcinfo.com\/wp-json\/wp\/v2\/media?parent=14183"}],"wp:term":[{"taxonomy":"work_category","embeddable":true,"href":"https:\/\/www.dmcinfo.com\/wp-json\/wp\/v2\/work_category?post=14183"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}