DocuMine Documentation

Extract terms contained in an entity dictionary

When you want to extract any one of many potential matches, create an entity with a dictionary in DocuMine. You can then write rules that extract the matching dictionary terms from documents or layout elements.

For detailed information on creating an entity with a dictionary, please see Entities and Non-dictionary vs. dictionary entities.

The following rule serves as an example. It targets sections that contain animal species, extracts the species, and creates the respective entities for them.

To make your rule work, you need a representation of the entity (“species”) that your rule can refer to: Create a “species” entity in the DocuMine settings, and fill its dictionary with the animal species you want to detect in your documents.

rule "DOC.5.2: Species"
    when
        FileAttribute(label == "OECD Number", valueEqualsAnyOf("402","403"))
        $section: Section(hasEntitiesOfType("species"))
    then
        $section.getEntitiesOfType("species").forEach(entity -> {
            entity.apply("DOC.5.2", "Species found.");
            entity.setValue(entity.getValue().toLowerCase());
        });
    end

Syntax

Explanation

rule "DOC.5.2: Species"

Name of the rule

Each rule must have a unique name. For further information, please see Rule naming.

FileAttribute(label == "OECD Number", valueEqualsAnyOf("402","403"))

Specifies that the rule applies to documents with the “OECD Number” file attribute value “402” or “403”.

$section: Section(hasEntitiesOfType("species"))

Targets sections that contain entities of the type “species” and binds them to the “$section” variable.

$section.getEntitiesOfType("species").forEach(entity -> {...});

Iterates over each entity of type "species" in the matched section and applies the specified actions to it.

entity.apply("DOC.5.2", "Species found.");

Applies the "DOC.5.2" identifier and the message "Species found." to each entity created.

entity.setValue(entity.getValue().toLowerCase());

Modifies the value of each matched entity to lowercase.
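
To see the effect of the rule outside of DocuMine, the logic can be approximated in plain Java. This is only a minimal sketch: the Entity class, the valueEqualsAnyOf helper, and the applySpeciesRule method are hypothetical stand-ins written for this illustration and are not the DocuMine classes described in the Javadoc. Only the calls inside the forEach loop mirror the rule itself.

```java
import java.util.List;

class SpeciesRuleSketch {

    // Hypothetical stand-in for an entity; NOT the real DocuMine class.
    static class Entity {
        private String value;
        private String ruleId;
        private String reason;

        Entity(String value) { this.value = value; }

        String getValue() { return value; }
        void setValue(String value) { this.value = value; }

        // Records the rule identifier and the message on the entity.
        void apply(String ruleId, String reason) {
            this.ruleId = ruleId;
            this.reason = reason;
        }

        String getRuleId() { return ruleId; }
        String getReason() { return reason; }
    }

    // Assumed behavior of the file attribute check: true if the
    // attribute value equals one of the allowed values.
    static boolean valueEqualsAnyOf(String actual, String... allowed) {
        for (String candidate : allowed) {
            if (candidate.equals(actual)) {
                return true;
            }
        }
        return false;
    }

    // Approximates the rule: the "when" part is the guard,
    // the "then" part is the forEach loop.
    static void applySpeciesRule(String oecdNumber, List<Entity> speciesInSection) {
        if (!valueEqualsAnyOf(oecdNumber, "402", "403")) {
            return; // rule does not fire for other OECD numbers
        }
        speciesInSection.forEach(entity -> {
            entity.apply("DOC.5.2", "Species found.");
            entity.setValue(entity.getValue().toLowerCase());
        });
    }

    public static void main(String[] args) {
        List<Entity> species = List.of(new Entity("Rat"), new Entity("MOUSE"));
        applySpeciesRule("402", species);
        species.forEach(e ->
            System.out.println(e.getValue() + " [" + e.getRuleId() + "] " + e.getReason()));
    }
}
```

In this sketch, a document with the OECD number "404" leaves the entities untouched, because the guard corresponds to the FileAttribute condition in the rule. Note that toLowerCase() without arguments uses the default locale of the Java runtime, so the result can differ for some characters depending on the environment.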

Notice

For further information about the methods listed in the table, please refer to the Javadoc.