DocuMine Documentation

Component rule creation

Component rules bind the extracted entities to components. The component mapping follows the entity extraction and allows for a structured overview of the extracted information. Once the component construction is completed, users can export the components from DocuMine.

Before you can create a component rule, you must have extracted one or more entities using an entity extraction rule. For further information on entity extraction, please see Entity rule creation and Create an entity.

You can construct a component from one or more entities. DocuMine automatically creates a component for each extracted entity; i.e., in the case of a 1:1 mapping, you do not need a separate component rule.

The component construction can include transformations and standardizations.

  • Transformation:

    Transformation is the process of modifying the format or structure of the extracted data, such as changing a date's format or calculating the duration between a given start and end date.

    For extensive transformation needs, such as replacing multiple values with other values, you can upload a CSV file with detailed mappings. This eliminates the need to hardcode all the replacement values in the component rules. A common use case is replacing internal codes with product names. For detailed information on component mapping, please see Component mapping.

  • Standardization:

    Standardization is the process of aligning varying representations of the same data to one consistent value. For example, a component's capitalization can be adjusted to match a defined standard, even if the value was extracted in a different format.
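The transformation and standardization steps described above can be illustrated in plain Java. This is only a sketch and not DocuMine's API: the class name ComponentValueSketch, its method names, the date patterns, and the inline "code,name" mapping format are all assumptions made for illustration. The methods that are actually available in component rules are documented in the Javadoc.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper class; none of these names come from DocuMine.
public class ComponentValueSketch {

    // Transformation: change a date's format, e.g. "31/01/2024" to "2024-01-31".
    public static String reformatDate(String value, String fromPattern, String toPattern) {
        LocalDate date = LocalDate.parse(value, DateTimeFormatter.ofPattern(fromPattern, Locale.ROOT));
        return date.format(DateTimeFormatter.ofPattern(toPattern, Locale.ROOT));
    }

    // Transformation: number of days between a start and an end date (both in ISO format).
    public static long durationInDays(String isoStart, String isoEnd) {
        return ChronoUnit.DAYS.between(LocalDate.parse(isoStart), LocalDate.parse(isoEnd));
    }

    // Mapping: read "code,name" lines (an assumed CSV layout) into a lookup table.
    public static Map<String, String> parseMapping(String csv) {
        Map<String, String> mapping = new HashMap<>();
        for (String line : csv.split("\\R")) {
            String[] cells = line.split(",", 2);
            if (cells.length == 2) {
                mapping.put(cells[0].trim(), cells[1].trim());
            }
        }
        return mapping;
    }

    // Mapping: replace a value if the table has an entry for it, otherwise keep it unchanged.
    public static String applyMapping(Map<String, String> mapping, String value) {
        return mapping.getOrDefault(value.trim(), value);
    }

    // Standardization: trim the value and normalize it to "First letter upper, rest lower".
    public static String capitalizeFirst(String value) {
        String trimmed = value.trim();
        if (trimmed.isEmpty()) {
            return trimmed;
        }
        return trimmed.substring(0, 1).toUpperCase(Locale.ROOT)
                + trimmed.substring(1).toLowerCase(Locale.ROOT);
    }
}
```

Keeping the replacement values in a lookup table rather than in the code mirrors the reason given above for uploading a CSV file: the mapping can change without touching the rule logic.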

Creating component rules

Component rules target one or more extracted entities and apply the required component construction and transformation commands.

Please note: Editing active component rules can lead to errors in the information extraction process. We therefore recommend that you do not edit existing component rules without first contacting your knecon representative or knecon support.

Component rule skeleton:

rule "<Rule Group>.<Rule Unit>.<Rule Number>: <Rule Name>"
    when
        // Define the target entity for the component construction
    then
        // Apply component construction commands
    end

In the case of component rules, the when-then parts define the following conditions and actions:

When:

The "when" part filters the document for the relevant entities and for content markers and layout elements within them (e.g., a specific keyword, dates in certain formats, or a layout element such as a table).

Then:

The "then" part instructs DocuMine to create the right component(s), how to bind the identified entity values to the component(s), and how to use them to display the extracted information.
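The division of labor between the two parts can be modeled conceptually in plain Java: a condition selects the entities, and an action binds their values to a component. This is a sketch under stated assumptions, not how DocuMine implements rules. The classes Entity and Component, the methods when and then, and the comma-separated joining of values are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Conceptual model only; every name here is hypothetical and not part of DocuMine.
public class WhenThenSketch {

    public static final class Entity {
        public final String type;
        public final String value;

        public Entity(String type, String value) {
            this.type = type;
            this.value = value;
        }
    }

    public static final class Component {
        public final String name;
        public final String value;

        public Component(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    // "when": keep only the entities the rule targets (here: selected by entity type).
    public static List<Entity> when(List<Entity> entities, String targetType) {
        List<Entity> matched = new ArrayList<>();
        for (Entity entity : entities) {
            if (entity.type.equals(targetType)) {
                matched.add(entity);
            }
        }
        return matched;
    }

    // "then": bind the matched values to one component.
    // Returns an empty Optional when nothing matched, i.e. the rule does not fire.
    public static Optional<Component> then(String componentName, List<Entity> matched) {
        if (matched.isEmpty()) {
            return Optional.empty();
        }
        List<String> values = new ArrayList<>();
        for (Entity entity : matched) {
            values.add(entity.value);
        }
        return Optional.of(new Component(componentName, String.join(", ", values)));
    }
}
```

In this model, several entities of the same type end up in a single component, which corresponds to constructing one component from more than one entity.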

Available methods

For further information about the methods to be used, please refer to the Javadoc.