What Is a Searchable PDF and Why Is It Still Important with Copiers?

Sometimes we need to educate, right?

At all of my appointments I mention that our devices will scan documents to searchable PDFs. I then follow up with, “Do you know the benefits of being able to create searchable PDFs?”

The reason for the question is that I’ve found that many people don’t know what a searchable PDF is, and many are too shy or too embarrassed to admit they don’t know what the term means.

It’s awesome when the prospect states that they don’t know and even better when you explain what they can accomplish by having their documents scanned as a searchable PDF.

I posted the information below just about seven years ago on the old blog site when copier manufacturers started to include options to create searchable PDFs from our multifunctional copiers.

What is a searchable PDF?

In simple terms, a searchable PDF is an image (picture) containing the “text” in a layer (usually behind the image and not visible). A scanned document (PDF as image format) is NOT searchable until an OCR (optical character recognition) process is performed on the document. Some scanning hardware can deliver a searchable PDF (the OCR process is performed during the delivery process).

A searchable PDF can also be created by PDF distiller software. This process “converts” an electronic file, such as an MS Word document, to PDF format. Because the original document is already digital (it contains real text), the OCR process is not required and a searchable PDF is rendered.

An easy way to determine if a PDF is searchable is to open the document with Adobe Reader or Acrobat and perform a “find” function on a word you can see on the page. If the word is found and highlighted, the document is a searchable PDF.
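For anyone who wants to run the same check on a folder of scans instead of opening each file, here is a minimal sketch in Python. It assumes the third-party pypdf package is installed, and the file name `scan.pdf` and both function names are made up for illustration. It is a rough equivalent of the manual “find” test, not a guarantee: a PDF with a poor OCR layer will still count as searchable.

```python
def is_searchable(page_texts, min_chars=1):
    """Return True if any page yields at least min_chars non-whitespace characters.

    page_texts is a list of strings, one per page, holding whatever text
    could be read from the PDF's text layer (empty for image-only pages).
    """
    return any(len("".join(text.split())) >= min_chars for text in page_texts)


def extract_page_texts(pdf_path):
    """Read the text layer of each page (assumes pypdf is installed)."""
    from pypdf import PdfReader  # third-party package: pip install pypdf

    reader = PdfReader(pdf_path)
    # extract_text() can come back empty for image-only pages, so fall back to "".
    return [page.extract_text() or "" for page in reader.pages]


# Example usage (hypothetical file name):
# print(is_searchable(extract_page_texts("scan.pdf")))
```

An image-only scan gives back empty strings for every page, so the check returns False until the document has been through OCR.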

Benefits of a searchable PDF

Searchable PDFs are useful for retrieving documents from a document repository (full content management) and for finding the location of a word or words within a document.

Adobe Systems provides a free downloadable tool known as an iFilter. The iFilter provides a link between the “text” layer of the searchable PDF and an “indexing” engine. This connection provides for retrieval of the document by any word(s) contained in the “text” layer or in the metadata (Title, Subject, Author, Keywords) of a PDF.
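To picture what an “indexing” engine does with that text layer, here is a toy sketch in Python. It is not how the iFilter or any of the engines below is actually implemented; the function names and sample file names are invented for illustration. It simply maps every word to the documents that contain it, which is the basic idea behind retrieving a document by any word in its text.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each word to the set of document names whose text contains it.

    documents is a dict of {document name: text from the PDF's text layer}.
    """
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return the names of documents that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical sample data standing in for OCR'd scans.
docs = {
    "invoice.pdf": "Invoice for part number AB-1234",
    "memo.pdf": "Staff memo about the copier lease",
}
index = build_index(docs)
print(search(index, "AB-1234"))
```

An image-only scan would contribute no words to the index at all, which is why it can never come up in a search.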

Indexing engines include:

  • The catalog feature of Adobe Acrobat: A powerful engine which provides advanced searching functionality.
  • MS Indexing Services: An “index” maintained at the server level with “load” processing options built into MS server platforms. Note: this application has an unlimited user retrieval tool that leverages this free MS service.
  • MS Desktop Search: A free, powerful, downloadable MS tool that maintains an “index” on the desktop of desktop files, server files or both.
  • MS SharePoint: Searchable PDFs can be retrieved with the built-in query tool.
  • Other DMS: Most document management systems can retrieve searchable PDFs.

The requirements are a full license of TOCR (the OCR engine) and the “captured” document should be a Group 3 or 4 B&W .tif file.

There are two components that are used to create searchable PDFs. The OCR processor must be set to full page; however, you may OCR all pages, only the first page, or just the pages you identify.
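The three page choices can be pictured with a small Python sketch. This is only an illustration of the idea, not the settings screen of any particular OCR product; the function name and the mode names are assumptions of mine.

```python
def pages_to_ocr(total_pages, mode="all", selected=None):
    """Return the 1-based page numbers that should go through OCR.

    mode is "all", "first", or "selected" (hypothetical names).
    selected is a list of page numbers, used only when mode is "selected".
    """
    if mode == "all":
        return list(range(1, total_pages + 1))
    if mode == "first":
        return [1] if total_pages >= 1 else []
    if mode == "selected":
        # Drop duplicates and any page numbers outside the document.
        return sorted({p for p in (selected or []) if 1 <= p <= total_pages})
    raise ValueError(f"unknown mode: {mode}")
```

Running OCR on only the first page is faster, but only the words on that page end up searchable.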

The best way to explain this to a customer goes somewhat like this:

“I understand you do not have the ability to scan documents as a searchable PDF, is that correct?”

“Yes,” replies the prospect.

I then state, “I would assume that when you bring up a PDF document and you’re searching for a certain phrase, part number, or word, you are scrolling with the mouse, viewing each page and looking for the data that you need, is that correct?”

The prospect agrees with me and I then say, “All right, when you open a PDF that is searchable you no longer have to scroll page by page. In Adobe Reader there is a find button. In that button area, you type in the word, part number or data that you are looking for, press Enter, and the data that you entered will be highlighted in the PDF document.”

It’s wonderful to see their facial expression change right in front of your eyes!

This is something I sometimes take for granted, yet a searchable PDF still has tremendous value for certain prospects. You just need to take the time to educate them.

Good selling!

 

Art Post
About the Author
One of the most recognizable salespeople in the office equipment space and a veteran of 40-plus years in the sales game, ART POST is also the creator of P4P Hotel, a rest stop for salespeople to catch up on the highs, lows and developments in office technology. The site also allows industry pros to touch base with peers and have an open dialog about the state of the industry. Post’s blogs number in the thousands, and his writing has appeared in numerous industry publications. He can be reached at arthurkpost@gmail.com.