Configure OCR Settings

OCR does not require you to configure any settings before use. However, you can use the following settings to tailor OCR to the data and workflow needs of your project.

Set Denoiser and Spellcheck

Denoiser improves the legibility of documents that are hard to read, such as documents that were copied or faxed multiple times. Turning on Denoiser increases the time required to process an uploaded file, so OCR also offers a Smart denoiser option. Smart denoiser lets the system apply the process only to documents that would benefit, which reduces overall processing time.

Spellcheck helps you identify terms that may have been incorrectly entered in the original document.

  1. Under the OCR tab on the left side menu, click Settings.
  2. Under Denoiser Status, click Denoiser on to improve all documents or Smart denoiser to have the system determine which documents require the service.
  3. Under Use Spellcheck?, click Yes.
  4. Click Update Settings.

Enable Document Classification

A faxed patient record can contain multiple document types in a single file. Classifying these sections creates a linked table of contents for the document. You can also configure OCR to attempt to classify the document on import.

  1. Under the OCR tab on the left side menu, click Settings.
  2. Under Documentation Classification, click Classification on, or Advanced Classification.
  3. Click Update Settings.
  4. Import a document or open an existing document.
  5. To confirm OCR's classification or set the classification manually, click the pencil icon next to the greyed-out text Class: at the top of the open document to reveal the Classification panel.
  6. On the panel, if your document type for the current page is listed under the Recent Classifications column, click the document type.
  7. On the panel, if your document type for the current page is not listed under the Recent Classifications column, click in the Search... field and enter a document type to locate an existing type or create a new document type.
  8. Page through the document and repeat the classification process whenever you encounter a new document type.
  9. To display the created table of contents, at the top of the screen, click the Data icon, click Document Details, and click Document Classification.
  10. Under Document Classification, click any Classification section to navigate to that page.

Enable Report Extraction

For common structured report types, such as a CBC panel, OCR can employ specific configurations to more accurately extract data. Any loaded extractors are automatically run against uploaded documents when you enable report extraction.

To obtain a report extractor customized for your workflow, you can contact your LifeOmic representative or create your own extractor. Creating your own extractor requires a technical understanding of JSON, regular expressions, and HL7 FHIR. LifeOmic provides instructions and an extractor builder tool; see Building OCR Report Extractors.
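
As a rough illustration of how regular expressions and FHIR fit together in this kind of work, the sketch below pulls a hemoglobin value out of report text and wraps it in a minimal FHIR-style Observation. It is not the LifeOmic extractor format: the pattern, the function name `extract_hemoglobin`, and the sample text are all assumptions made for this example.

```python
import re

# Hypothetical pattern for a report line such as "Hemoglobin: 13.5 g/dL".
HGB_PATTERN = re.compile(
    r"\b(?:Hemoglobin|HGB)\b\s*:?\s*(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>g/dL)",
    re.IGNORECASE,
)


def extract_hemoglobin(text):
    """Return a minimal FHIR-style Observation dict, or None if nothing matches."""
    match = HGB_PATTERN.search(text)
    if match is None:
        return None
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {
            "coding": [
                {
                    "system": "http://loinc.org",
                    "code": "718-7",
                    "display": "Hemoglobin [Mass/volume] in Blood",
                }
            ]
        },
        "valueQuantity": {
            "value": float(match.group("value")),
            "unit": match.group("unit"),
        },
    }


# Illustrative input only; real report layouts vary widely.
sample = "WBC 6.2\nHemoglobin: 13.5 g/dL\n"
observation = extract_hemoglobin(sample)
```

A real extractor would need one pattern per report field and would have to cope with layout differences between laboratories, which is where the extractor builder tool is likely to help.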

  1. Under the OCR tab on the left side menu, click Settings.

  2. Under Report Extraction, click Report extraction on.

  3. Click Update Settings.

  4. To verify that report extractors are operating, view a document, click the Data Tables icon at the top of the screen, and click the Report Extractors tab.

  5. The Report Extractors page displays the configured extractors or the message There are no extractors in this project.

Additional Report Extraction Operations

All loaded extractors run automatically against uploaded documents. You may want to run them again, for example, when additional extractors were loaded after a document was originally processed.

  1. View a document and at the top of the screen, click the Data Tables icon and click on the Report Extractors tab.
  2. To remove any data in the current document generated by an extractor, click Delete All Extractor Data.

  3. To run all extractors against a document, click Rerun All Extractors.


Run Automated Analysis

OCR uses automated analysis to identify and highlight phrases with data potential. You can configure full or project data based analysis. Full analysis triggers Amazon Comprehend Medical processes and includes large, standard medical databases, such as ICD-10-CM. Full analysis also includes all the operations of project data based analysis. This includes using your PHC project data and any ontologies you loaded for that specific project. Both analyses also look at your project's previously extracted data and infer similarities. Project data based analysis does not employ Amazon Comprehend Medical.
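
The relationship between the two modes can be summarized as a superset: full analysis uses everything that project data based analysis uses, plus Amazon Comprehend Medical. The sketch below only models that relationship. The source labels and the function `analysis_sources` are hypothetical names for this illustration, not OCR settings or API values.

```python
# Hypothetical labels for the sources each mode draws on, taken from the description above.
PROJECT_SOURCES = frozenset({
    "project data",
    "project ontologies",
    "previously extracted data",
})
FULL_ONLY_SOURCES = frozenset({"Amazon Comprehend Medical"})


def analysis_sources(mode):
    """Return the set of sources consulted for a given analysis mode."""
    if mode == "project":
        return PROJECT_SOURCES
    if mode == "full":
        return PROJECT_SOURCES | FULL_ONLY_SOURCES
    raise ValueError(f"unknown analysis mode: {mode!r}")
```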

Tip

If you are unsure of the type of analysis needed, configure project data based analysis. Project data based analysis is faster and consumes fewer resources.

  1. Under the OCR tab on the left side menu, click Settings.
  2. Under Automated Analysis, click Project data based analysis or Full Analysis.
  3. Click Update Settings.
  4. Import a document or view an existing document.
    1. If you are viewing an existing document that was imported before you configured automated analysis, click the Data Tables icon and click Reanalyze Document.
    2. After automated analysis is configured, importing a document automatically triggers the analysis.
  5. At the top of the screen, click the Annotations icon and click Automated Analysis.

  6. Hold down your mouse button and select a highlighted term.

    Tip: You can click the select icon to extend the selection to the entire line.

  7. From the menu that appears, click the Analyze Selection icon.

    The Analyze Contents menu appears with suggestions for FHIR data.

  8. Click on the most accurate suggestion. Your choice populates other fields.

  9. Go through the drop-down menus and select the most accurate options. Confirm the suggested information or enter more accurate information in the fields.

  10. Once you are satisfied with the information, click Create ....

  11. To view the created data, at the top of the screen, click the Data icon and then the Analyzed Data icon.

  12. To view tables of the suggested data, click the Data Tables icon and click the Analyzed Suggestions tab.
  13. To download a CSV file of the extracted data, click the Data Tables icon, click the Extracted Data tab, and click Download Extracted Data.

Analysis Confidence Threshold

You can choose a level of certainty and configure OCR to only display analyzed suggestions that are above that threshold. The default setting is for OCR to show all analyzed suggestions.

Understanding the Analysis Confidence Threshold

When OCR analyzes a document and generates suggestions, the system assigns different levels of certainty to the results. For example, the system may assign a very high level of certainty to the correlation between the term CBC in the source text and the code 58410-2 for a CBC panel, but it may assign a lower level of certainty to the correlation between the term Heart in the source text and the code 18142-0 for a Heart chambers study observation.

In our first example, the user sets the confidence threshold for the OCR analyzed suggestions to Very High Confidence. The Data Tables page displays only the 75 analyzed suggestions that meet or exceed that threshold. In the Procedures section, only five suggestions are shown, including CBC. The Heart suggestion is not displayed, since its degree of certainty falls below the Very High Confidence threshold set by the user.

In our second example, the user sets the confidence threshold for the OCR analyzed suggestions to Low Confidence. The Data Tables page now displays the 150 analyzed suggestions that meet or exceed this lowered threshold. In the Procedures section, the number of suggestions increases to 22. The Heart suggestion is now displayed, since its same degree of certainty now meets or exceeds the new Low Confidence threshold set by the user.
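
The two examples above amount to filtering suggestions against a minimum level. The following is a minimal sketch of that idea, not OCR's implementation. The intermediate level names, the function `filter_suggestions`, and the confidence assigned to each suggestion are assumptions; the text only names the Low and Very High levels.

```python
# Assumed ordering, lowest to highest. Only "Low" and "Very High" appear in the text.
LEVELS = ["Low", "Medium", "High", "Very High"]


def filter_suggestions(suggestions, threshold):
    """Keep suggestions whose confidence meets or exceeds the threshold."""
    minimum = LEVELS.index(threshold)
    return [s for s in suggestions if LEVELS.index(s["confidence"]) >= minimum]


# Hypothetical confidence values for the two terms discussed above.
suggestions = [
    {"term": "CBC", "code": "58410-2", "confidence": "Very High"},
    {"term": "Heart", "code": "18142-0", "confidence": "Low"},
]

strict = filter_suggestions(suggestions, "Very High")  # CBC only
relaxed = filter_suggestions(suggestions, "Low")       # CBC and Heart
```

Lowering the threshold never removes a suggestion that was already displayed; it can only add suggestions with lower certainty.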

Set an Analysis Confidence Threshold

  1. Under the OCR tab on the left side menu, click Settings.
  2. Under Analysis Confidence Threshold, expand the menu and click a confidence level.
  3. Click Update Settings.
  4. To see the Analyzed Suggestions: view a document, and at the top of the screen, click the Data Tables icon.

Configure Review Stages

Review stages help you keep track of where OCR documents are in your workflow by allowing you to assign a label to the document. The label allows anyone who views the document to identify what operations are completed or pending. You can set the number and names of stages based on how your organization operates.

  1. Under the OCR tab on the left side menu, click Settings.
  2. Under Review Stages, click Configure Review Stages.
  3. Enter the stages as a comma-separated list, for example, not started, in progress, completed.
  4. Click the check icon.
  5. Click Update Settings.
  6. To set a review stage for a document, view a document and click the Data icon and the Document Details icon.
  7. Under Document Details, expand the Review Stage menu, then under Document Review Stage, expand the menu and click the desired stage.
  8. To view the review stage of a document, look under the Document Details section of an open document or in the OCR Documents section of the Subjects viewer.
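
If you draft the stage list elsewhere before pasting it into the settings field, a small helper can normalize it first. This is a sketch under assumptions: the function `parse_stages` is hypothetical, and whether OCR itself trims whitespace or rejects duplicate stage names is not stated in this guide.

```python
def parse_stages(raw):
    """Split a comma-separated string into an ordered list of unique stage names."""
    stages = []
    for part in raw.split(","):
        name = part.strip()
        # Skip empty entries (for example, from a trailing comma) and repeats.
        if name and name not in stages:
            stages.append(name)
    return stages


stages = parse_stages("not started, in progress, completed")
```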

Configure the Search List

In addition to a standard document search function, OCR lets you create a search list of multiple terms that you can access from any document within the project.

  1. Under the OCR tab on the left side menu, click Settings.
  2. Click Add Search List.
  3. Enter a descriptive name in the List Title field.
  4. Enter a search term or multiple search terms separated by commas.
  5. Click Add Search Term.
  6. Click Update Settings.
  7. To access the search list, view a document.
  8. Click the list icon to the right of the search field and click your choice of search lists.

    All the terms on the search list are highlighted in the document.

  9. Click the list icon to the right of the search field to display a list of search results.
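
Conceptually, a search list behaves like a multi-term search over the document text. The sketch below shows one way to locate every term from a comma-separated list. It is illustrative only: the function `find_terms` is hypothetical, and the case-insensitive matching is an assumption rather than documented OCR behavior.

```python
import re


def find_terms(text, search_list):
    """Return (start, end, matched_text) for each search-list term found in text."""
    terms = [t.strip() for t in search_list.split(",") if t.strip()]
    if not terms:
        return []
    # Try longer terms first so a longer phrase wins over a shorter term at the same position.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in ordered), re.IGNORECASE)
    return [(m.start(), m.end(), m.group(0)) for m in pattern.finditer(text)]


hits = find_terms("Patient reports chest pain and diabetes.", "chest pain, diabetes")
```

Each tuple gives the character span to highlight, so the same result can drive both highlighting and a results list.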

OCR Ontology

In addition to the machine learning analysis driven by Amazon Comprehend Medical and its general, public databases, such as ICD-10-CM, OCR can analyze documents using a custom ontology created by your organization. The ontology-based analysis is driven by your needs and a set of key terms and details you provide. This targeted focus allows the ontology-based analysis to be very accurate and efficient.

Creating an OCR ontology is simple. You download the spreadsheet template from OCR and add your desired terms and additional information, such as medical coding. You then upload the completed spreadsheet and configure OCR automated analysis.

When you configure the full analysis option for automated analysis, OCR automatically runs both types of analysis during document ingestion. When you configure the project data based analysis option, OCR runs only the ontology-based analysis. Both the Ontologies Suggestions from the ontology analysis and the Smart Suggestions from the Amazon Comprehend Medical analysis display on a document's Data Tables page, which also lets you filter by source.

Note

OCR ontology is one of two OCR tools that search a document for a set of terms you provide. The search list is a basic tool that is simple to configure in the app and highlights a limited number of terms. OCR ontology is a sophisticated tool that uses a versioned spreadsheet of potentially thousands of terms and provides detailed suggestions to help you create recorded values.

Create and Upload an OCR Ontology

  1. Under the OCR tab on the left side menu, click Settings and scroll down to the OCR Ontologies section.
  2. Click Download Template.
  3. Open the template spreadsheet with Microsoft Excel or any spreadsheet program compatible with the Microsoft Excel Open XML Format (.xlsx).
  4. Fill in the spreadsheet cells with the appropriate information using the example and chart below:

    Ontology Example: OCR Ontologies Template (image)

    Ontology Cell Description Chart

    Cell Type: Description
    Category: This is a user-defined subset of information that includes the display term. For example, the display term heart attack might fall under the category cardiac. You can include multiple terms and separate them with a pipe delimiter (|) (required).
    Display: OCR searches for this term and displays it in the analysis suggestions (required).
    Code: Any code, such as an ICD-10-CM code or laboratory testing company code, that you want associated with the display term. If you do not have a code, you can add the display term to this field (required).
    System: The source for medical coding. If you are not concerned with proper medical coding, you can use http://lifeomic.com/temp (required).
    Synonyms: OCR searches for words or abbreviations that have the same meaning as the display term but uses the display term in the suggestion. You can include multiple terms and separate them with a pipe delimiter (|) (optional).
    IsObservation: Put Yes if the display term falls under the FHIR Observation resource. Leave blank if it does not. (It is required that at least one of the four resources is marked Yes.)
    IsMedication: Put Yes if the display term falls under the FHIR Medication resource. Leave blank if it does not.
    IsCondition: Put Yes if the display term falls under the FHIR Condition resource. Leave blank if it does not.
    IsProcedure: Put Yes if the display term falls under the FHIR Procedure resource. Leave blank if it does not.
  5. Delete row 2 (the example row) in the spreadsheet and save the ontology spreadsheet to your computer with a useful name.

  6. Click Upload New Ontology Version.
  7. Navigate to your saved ontology .xlsx file and click Open.
  8. Complete the Run Automated Analysis procedure.
  9. To view tables of the ontology suggestions, from an open document, click the Data Tables icon and click the Analyzed Suggestions tab.
  10. Confirm the Ontologies box is checked to display the ontology-based suggestions. If Smart Suggestions is checked, click it to deselect it and hide the Smart Suggestions, which are generated from the Amazon Comprehend Medical analysis. Note: You can filter Ontologies results by category in the Data Tables view or the Document view with suggestions turned on.
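
Before uploading, it can help to check each spreadsheet row against the rules in the cell description chart. The sketch below validates rows that have already been read into dictionaries keyed by column name. It is a hedged example, not a LifeOmic tool: the helpers `validate_row` and `split_pipe` are hypothetical, the sample row is invented, and treating Yes as case-insensitive is an assumption.

```python
REQUIRED_COLUMNS = ("Category", "Display", "Code", "System")
RESOURCE_COLUMNS = ("IsObservation", "IsMedication", "IsCondition", "IsProcedure")


def split_pipe(value):
    """Split a pipe-delimited cell into a list of trimmed, non-empty terms."""
    return [item.strip() for item in (value or "").split("|") if item.strip()]


def validate_row(row):
    """Return a list of problems found in one ontology row (empty if none)."""
    problems = []
    for column in REQUIRED_COLUMNS:
        if not (row.get(column) or "").strip():
            problems.append(f"{column} is required")
    marked = [c for c in RESOURCE_COLUMNS if (row.get(c) or "").strip().lower() == "yes"]
    if not marked:
        problems.append("mark at least one resource column Yes")
    return problems


# Invented sample row. Code repeats the display term because no code is available,
# and System uses the placeholder URL named in the chart.
row = {
    "Category": "cardiac",
    "Display": "heart attack",
    "Code": "heart attack",
    "System": "http://lifeomic.com/temp",
    "Synonyms": "MI|myocardial infarction",
    "IsCondition": "Yes",
}
problems = validate_row(row)
synonyms = split_pipe(row["Synonyms"])
```

Reading the .xlsx file itself requires a spreadsheet library, which is left out here to keep the sketch dependency-free.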

Last update: 2021-04-22