The GetText component enables users to extract text from a PDF file.
Double-click the GetText component title bar to open the EXTRACTOR SETTINGS window.
- Select the checkbox to use OCR for extraction.
a. Select an OCR engine from the list.
b. If you select a commercial OCR engine, provide its credentials
(for example: Access Key, Secret Key, API Key, Application ID, Cloud Password, or License Key).
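The rule described above (credentials are needed only when OCR is enabled and a commercial engine is selected) can be sketched in Python. This is only an illustration of the logic, not the product's implementation: `ExtractorSettings`, `COMMERCIAL_ENGINES`, and the engine name `ExampleCloudOCR` are hypothetical, and treating "Windows" as an engine that needs no credentials is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical set of commercial engines; the real list comes from the product.
COMMERCIAL_ENGINES = {"ExampleCloudOCR"}


@dataclass
class ExtractorSettings:
    """Hypothetical model of the choices made in the EXTRACTOR SETTINGS window."""
    use_ocr: bool = False
    engine: Optional[str] = None
    credentials: Dict[str, str] = field(default_factory=dict)

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the settings are usable."""
        problems: List[str] = []
        if not self.use_ocr:
            # Without OCR, no engine or credentials are needed.
            return problems
        if not self.engine:
            problems.append("select an OCR engine")
        elif self.engine in COMMERCIAL_ENGINES and not self.credentials:
            problems.append(f"credentials required for {self.engine}")
        return problems
```

For example, `ExtractorSettings(use_ocr=True, engine="Windows").validate()` returns an empty list under these assumptions, while selecting the hypothetical commercial engine without credentials reports a problem.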
The GetText component exposes the Control In, Control Out, Data In, and Data Out ports by default.
|Control In||Must be connected to the Control Out port of one or more components.|
|Control Out||Can be connected to the Control In port of another component or the default end component.|
|Data In||The GetText component exposes Data In ports by default, such as Pdf FilePath.|
|Data Out||Returns the content of the PDF document.|
To edit the properties of the GetText component, change the required property in the Properties window. You can edit the following properties.
|Search||Search for the respective property.|
|Delay After Execution||Specifies the wait time (in seconds) after the action is performed.|
|Delay Before Execution||Specifies the wait time (in seconds) before the action is performed.|
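The two delay properties can be pictured as a wait placed on either side of the component's action. The sketch below is a minimal illustration under that assumption; `run_with_delays` and its parameters are hypothetical names, not part of the product.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_delays(
    action: Callable[[], T],
    delay_before: float = 0.0,
    delay_after: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `action`, waiting the given number of seconds before and after it."""
    if delay_before > 0:
        sleep(delay_before)   # corresponds to Delay Before Execution
    result = action()
    if delay_after > 0:
        sleep(delay_after)    # corresponds to Delay After Execution
    return result
```

The `sleep` parameter is injectable only so that the behavior can be checked without actually waiting.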
Let us consider an example.
To extract data from a PDF file:
In the Toolbox, expand Utilities, locate the GetText component, and drop it on the Design surface.
Click the Filepath box and enter the required path.
You can specify the page number of the PDF file for page-specific extraction.
To override the existing data source, right-click Pdf FilePath.
Click Override and change the data source.
To learn more about overriding the data source of a data port, refer to the Override section.
Double-click the GetText title bar to open the EXTRACTOR SETTINGS window.
Select the checkbox and select an OCR engine from the list.
To use a commercial OCR engine, provide the required credentials.
(In this example, “Windows” is selected as the OCR Engine Type.)
Click OK.
Drag the MessageBox show component and drop it on the Design surface.
Connect the control ports and the data ports in the activity.
In the toolbar, click Run.
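The data flow of this example (a file's pages go in, optionally narrowed to one page, and the resulting text is shown in a message box) can be sketched as follows. This is a conceptual illustration only: `get_text` and `show_message` are hypothetical stand-ins, the list of strings stands in for text already extracted from each page, and 1-based page numbering is an assumption.

```python
from typing import List, Optional


def get_text(pages: List[str], page_number: Optional[int] = None) -> str:
    """Return the text of one page, or of all pages when no page is given.

    `pages` stands in for the per-page text of a PDF; page numbers are
    assumed to start at 1.
    """
    if page_number is None:
        return "\n".join(pages)
    if not 1 <= page_number <= len(pages):
        raise ValueError(f"page {page_number} is out of range 1..{len(pages)}")
    return pages[page_number - 1]


def show_message(text: str) -> None:
    """Stand-in for the message box: print the text it would display."""
    print(text)
```

Calling `show_message(get_text(["first page", "second page"], page_number=2))` mirrors connecting the Data Out port of GetText to the message box component.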