- DocuMine Documentation
- What's new?
- Earlier releases
- Version 1.1
Version 1.1
Discover the most important new features, fixes, and improvements contained in DocuMine 1.1.
Patch 1.1.8
Improvement:
Date converter extension:
The date converter now also supports the following formats:
dd-MMMM-yyyy
d-MMMM-yyyy
Explanation:
Format
Description
Example
d
Day of the month without leading zero
1st, 2nd, 3rd
dd
Day of the month with suffix and with leading zero
10th, 23rd, 26th
MMMM
Full month name (letters)
January, February, March
Patch 1.1.7
Improvement:
Date converter extension:
The date converter now also supports the following formats:
d['th']['st']['nd']['rd'] MMMM yyyy
e.g. 1st May 1988
dd['th']['st']['nd']['rd'] MMMM yyyy
e.g. 26th September 2022
Explanation:
Format
Description
Example
d
Day of the month without leading zero
1, 2, 3
dd
Day of the month with suffix and with leading zero
01, 02, 12,
MMMM
Full month name (letters)
January, February, March
yyyy
Year as a four-digit number
1988, 2003, 2023
Patch 1.1.6
Improvement:
Date converter extension:
The date converter now also supports the following formats:
MMMM, dd yyyy
MMMM, d yyyy
MMMM dd,yyyy
MMMM d,yyyy
Explanation:
Format
Description
Example
MMMM
Full month name (letters)
January, February, March
d
Day of the month without leading zero
3
dd
Day of the month with leading zero
03
Patch 1.1.5
Fix:
Fixed blank page issue caused by version caching problem
Improvement:
Re-analysis button available for files in error state
The reanalyze button is displayed to dossier owners and assignees for files in error state, helping prevent team blockages.
A file in error state can be identified by the document name being red and not clickable, with the status indicating "Re-processing required".
Re-processing required
When hovering over the list entry, the Analyze file button is displayed.
Analyze file
For further information, please see Document features.
Improvement:
The date converter now also supports the following format:
yyyy MMM dd
Patch 1.1.4
Improvement:
Date converter extension
The date converter now supports additional formats, including:
d mm yyyy MMM-d-yyyy yyyy-MMM-d yyyymmd d.MMMM-yyyy d.MMM.yyyy m-dd-yyyy m.dd.yyyy d MMM yy yyyy/mm/d d/mm/yy yyyy-mm-d d.MMMM.yy d.MMM.yy d-mm-yy d.MMMM.yyyy d mm yyyy yyyy.mm.d. dd MMM. yyyy yyyy MMM d
Explanation:
Format
Description
Example
MMM
Abbreviated month (letters)
Jan, Feb, Mar
MMMM
Full month name (letters)
January, February, March
mm
Month in numbers
01, 06, 11
d
Day of the month without leading zero
3
dd
Day of the month with leading zero
03
d['th']['st']['nd']['rd']
Day of the month with suffix
10th, 1st, 2nd, 23rd
Update rule X.0.0: Remove Entity contained by Entity of same type
Please replace rule X.0.0 with the new version below. This update solves the problem that sometimes rules are still matched because the section has not been updated, although the entity has been removed.
// Rule unit: X.0rule "X.0.0: Remove Entity contained by Entity of same type" salience 65 when $larger: TextEntity($type: type(), $entityType: entityType, !removed()) $contained: TextEntity(containedBy($larger), type() == $type, entityType == $entityType, this != $larger, !hasManualChanges()) then $contained.getIntersectingNodes().forEach(node -> update(node)); $contained.remove("X.0.0", "remove Entity contained by Entity of same type"); retract($contained); end
The following part has been added to the then section of the rule to solve the problem detailed above. It should be added to all X rules that still do not contain it:
$contained.getIntersectingNodes().forEach(node -> update(node));
Patch 1.1.3
Bug fixes:
Fixed headline and footer detection issues
Headline and footer detection issues have been fixed to ensure more reliable identification of headlines and prevent footers from being mistaken for paragraphs.
The rules will need to be modified to ensure the fixes achieve their full effect. Please find further information on the details in the developer section of these patch notes.
Fixed occasional problems with report download generation
Fixed document loading issue in the document viewer
Fixed issue with dictionary entry annotation
Improvement:
Date detection has been improved
Fix:
Fixed headline and footer detection⸺rule changes required
Headline and footer detection issues have been fixed to ensure more reliable identification of headlines and prevent footers from being mistaken for paragraphs.
Please modify the entity rules as follows to ensure these fixes achieve their full effect:
Add "salience 11" to rule H.0.0
This change guarantees rule H.0.0 is executives before rule X.50.1, which removes entities after tables and appendices, to ensure headlines will be annotated on the pages that follow the TOC.
rule "H.0.0: retract table of contents page" salience 11
Adjust rule X.50.0:
Add the following variable to the "when" block of the rule:
$headline: Headline(containsString("SUMMARY"))
Add the following condition to the $section variable:
&& !anyHeadlineContainsStringIgnoreCase("SUMMARY") && !getSectionIdentifier().isChildOf($headline.getSectionIdentifier()) )
These changes ensure that this rule does not affect sections related to or beneath "SUMMARY" headlines.
Complete rule:
rule "X.50.0: Remove entities in Tables and Appendix sections" when $headline: Headline(containsString("SUMMARY")) $section: Section( (anyHeadlineContainsStringIgnoreCase("Appendix") || anyHeadlineContainsStringIgnoreCase("Appendices") || anyHeadlineContainsStringIgnoreCase("Table") || containsAnyString("APPENDICES", "APPENDIX", "TABLE")) && !anyHeadlineContainsStringIgnoreCase("SUMMARY") && !getSectionIdentifier().isChildOf($headline.getSectionIdentifier()) ) $child: Section(getParent() == $section || this == $section) $entity: TextEntity(!getType().equals("batch_number_coa"), intersectingNodes contains $child) then $entity.remove("X.50.0", "Removed due to being in Appendix/Table"); end
Adjust rule DOC.41.0
Change the "when" block as follows:
rule "DOC.41.0: Executive summary" when $headline: Headline(containsString("SUMMARY")) $child: Headline(getSectionIdentifier().isChildOf($headline.getSectionIdentifier()) || getSectionIdentifier().equals($headline.getSectionIdentifier()))
Change the parameter of the createHeadlineEntityIfPresent class in the "then" block:
then createHeadlineEntityIfPresent((Section) $child.getParent(), "executive_summary", "DOC.41.0", "Executive summary (study design) header found.", entityCreationService);
The changes ensure the rule targets headlines specifically containing "SUMMARY" instead of broader sections. Actions are applied to the parent sections of these headlines, broadening the contextual scope and improving the rule's precision.
Complete rule:
rule "DOC.41.0: Executive summary" when $headline: Headline(containsString("SUMMARY")) $child: Headline(getSectionIdentifier().isChildOf($headline.getSectionIdentifier()) || getSectionIdentifier().equals($headline.getSectionIdentifier())) then createHeadlineEntityIfPresent((Section) $child.getParent(), "executive_summary", "DOC.41.0", "Executive summary (study design) header found.", entityCreationService); entityCreationService.bySemanticNodeParagraphsOnly($child.getParent(), "executive_summary", EntityType.ENTITY) .forEach(entity -> { entity.apply("DOC.41.0", "Executive summary (study design) found.", "n-a"); }); end
Patch 1.1.2
Security fixes:
Fixed cross-site scripting vulnerability that could have led to an account takeover
Issue: A malicious PDF file would have been able to execute scripts in a victim's browser, posing risks of session hijacking, website defacement, or redirection to malicious sites.
Resolution: We have enhanced PDF sanitization to better recognize non-standard JavaScript elements, effectively mitigating this risk.
Fixed theoretical clickjacking vulnerability on the dashboard
Issue: Clickjacking allows attackers to trick users into interacting with a page in unintended ways. The former Content Security Policy (CSP) was not restrictive enough to completely prevent clickjacking, should an attacker have figured out a way to exploit this vulnerability.
Resolution: The appropriate CSP 'frame-ancestors' directive has been implemented, sending response headers that instruct the browser to disallow framing from external domains.
Bug fix:
Fixed issue with scrambled text
Issue: If a PDF had a scrambled structure, the text extraction process resulted in scrambled text, despite being handled correctly by the PDF viewer.
Resolution: The PDF processing now includes a descrambling functionality that follows the visual reading order, mimicking the behavior of the PDF viewer.
Bug fixes:
Fixed privilege escalation vulnerability
Issue: Users with the user admin role—limited to managing user accounts—were able to reset the password of application admin accounts (admin role).
Resolution: The user admin role has been restricted to prevent changes to accounts that hold higher privileges and secure application admin accounts against unauthorized modifications.
Bug fixes:
Fixed issue with getPreviousSibling() method in the rules.
Issue: Some file got in error state when using the getPreviousSibling() method in the rules.
Resolution: We have made improvements to better handle edge cases where document navigation in the backend previously resulted in errors.
Patch 1.1.1
Bug fixes:
Typo (excess closing bracket) in the filename when downloading the JSON or XML component file
Missing annotations due to malformed analysis results caused by an error in the rules.
General
If you log into DocuMine without an assigned role, an information page informs you to contact your admin to assign you a role.
For further information, please see Login.
User login without an assigned role
Dossier
You can now download the components extracted from the dossier documents in bulk.
Click on the Component download icon in the dossier features. A small drop-down menu opens, allowing you to select the download format (JSON or XML).
DocuMine bundles the component extracted from the dossier documents into one download file that is downloaded directly to your computer.
For further information, please see Component download.

Component download icon
OpenAPI 3.0
The DocuMine API offers three new endpoints:
GET /api/dossier-templates/{dossierTemplateId}/entity-rules
The endpoint retrieves the entity rules that define components based on a file’s content within the respective dossier.
GET /api/dossier-templates/{dossierTemplateId}/component-rules
The endpoint retrieves the component rules that define components based on a file’s content within the respective dossier.
GET /api/dossier-templates/{dossierTemplateId}/file-attribute-definitions
The endpoint retrieves the file attribute definitions for a given file within the respective dossier.
The API documentation is provided to you as an OpenAPI 3.0 specification. Please ask your support contact for the specification.
For further information on the OpenAPI specification, please see: https://swagger.io/specification/
Entity rules
We recommend adding the following entity rule. It prevents annotations from being skipped when users change the type of an annotation. This will allow the recategorization to be applied directly.
rule "MAN.3.3: Apply recategorization entities by default" salience 128 when $entity: IEntity(getManualOverwrite().getRecategorized().orElse(false)) then $entity.apply("MAN.3.3", "Recategorized entities are applied by default."); end
We recommend changing rule MAN.1.0 as follows:
rule "MAN.1.0: Apply id removals that are valid and not in forced redactions to Entity" salience 128 when $idRemoval: IdRemoval( $id: annotationId, !removeFromDictionary, !removeFromAllDossiers, status == AnnotationStatus.APPROVED) $entityToBeRemoved: TextEntity(matchesAnnotationId($id)) then $entityToBeRemoved.getManualOverwrite().addChange($idRemoval); update($entityToBeRemoved); retract($idRemoval); $entityToBeRemoved.getIntersectingNodes().forEach(node -> update(node)); end
Behavior so far: If you remove a dictionary-based annotation manually and then add this same annotation again, the annotation is skipped.
New behavior: The annotation is applied and not skipped.