Entity rule creation
Entity rules define which layout element to target for an entity and provide the commands for extracting information from it. They are the first step of component extraction; component mapping follows.
Creating entity rules
To extract an entity, the entity must first be represented in DocuMine. For more information, see Create an entity.
Entity rule skeleton:
rule "<Rule Group>.<Rule Unit>.<Rule Number>: <Rule Name>"
    when
        // Define the target layout element for the information extraction
    then
        // Apply information extraction commands
end
In the case of entity rules, the when-then parts define the following conditions and actions:
When:
The "when" part filters the document for relevant layout elements (semantic nodes) and content markers (e.g., a specific keyword or document type).
Then:
The "then" part instructs DocuMine on what to do with the information identified within the filtered layout elements, how to apply the annotation, and how to extract the information (obtain entity values) to be included in a component.
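The division of labor between the two parts can be pictured in plain Java. The following is only an analogy, not DocuMine code: the classes Paragraph and Entity and the class WhenThenSketch are hypothetical stand-ins invented for this sketch, and the rule identifier, entity type, and keywords are placeholders. The "when" part becomes a filter predicate, and the "then" part becomes the action applied to each element that passes the filter.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical stand-in for a layout element; NOT a DocuMine class.
final class Paragraph {
    final String headline;
    final String text;
    Paragraph(String headline, String text) {
        this.headline = headline;
        this.text = text;
    }
}

// Hypothetical stand-in for an extracted, annotated entity; NOT a DocuMine class.
final class Entity {
    final String type;
    final String value;
    final String ruleId;
    final String reason;
    Entity(String type, String value, String ruleId, String reason) {
        this.type = type;
        this.value = value;
        this.ruleId = ruleId;
        this.reason = reason;
    }
}

final class WhenThenSketch {

    // "when": keep only paragraphs whose headline contains "references"
    // (case-insensitive) and whose text contains the marker "GLP".
    static boolean when(Paragraph p) {
        return p.headline.toLowerCase(Locale.ROOT).contains("references")
                && p.text.contains("GLP");
    }

    // "then": create an entity from the matched paragraph and record
    // which rule produced it and why.
    static Entity then(Paragraph p) {
        return new Entity("glp_reference", p.text, "T.2.0", "Reference paragraph found.");
    }

    // Apply the filter first, then the action, to every paragraph.
    static List<Entity> run(List<Paragraph> document) {
        return document.stream()
                .filter(WhenThenSketch::when)
                .map(WhenThenSketch::then)
                .collect(Collectors.toList());
    }
}
```

In this sketch, a paragraph that contains "GLP" but sits under an unrelated headline is dropped by the filter, so the action never sees it. A real rule relies on the rule engine to do this matching instead of an explicit loop or stream.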
Example:
rule "T.2.0"
    when
        $paragraph: Paragraph(
            getHeadline().containsStringIgnoreCase("references") &&
            containsString("GLP")
        )
    then
        entityCreationService.bySemanticNode($paragraph, "glp_reference", EntityType.ENTITY)
            .ifPresent(entity -> entity.apply("T.2.0", "Reference paragraph found."));
end
Available methods