Preparation release 0.8.1 #1123

Merged 28 commits on Sep 14, 2024
Commits
97cd71d  support HAL ID from consolidation service (kermitt2, Feb 25, 2024)
a7227b5  query by hal id only (kermitt2, Feb 25, 2024)
22d602e  quick fix for #1113 (kermitt2, May 9, 2024)
a77114d  Merge branch 'master' into glutton-0.3 (lfoppiano, Jun 9, 2024)
0a95872  update readme and changelog (lfoppiano, Jun 9, 2024)
b6a2a20  update documentation (lfoppiano, Jun 9, 2024)
4675511  update doc, cleaning, support python env>3.9 (kermitt2, Jun 16, 2024)
f1d703c  use crossref by default for consolidation (kermitt2, Jun 17, 2024)
c408076  fix delft command (lfoppiano, Jun 17, 2024)
50860c5  Merge branch 'master' into release-0.8.1 (lfoppiano, Jun 17, 2024)
ca18bd6  Merge branch 'refs/heads/master' into glutton-0.3 (lfoppiano, Jun 21, 2024)
8210d69  remove jdk 17 only constructs (lfoppiano, Jul 3, 2024)
56d351c  downgrade to jdk 11 (lfoppiano, Jul 3, 2024)
99f653a  Merge branch 'refs/heads/master' into release-0.8.1 (lfoppiano, Jul 17, 2024)
8d3a0de  update benchmarks (kermitt2, Aug 9, 2024)
ef9123f  update benchmark markdown (kermitt2, Aug 9, 2024)
cb570a6  doc with new benchmarks (kermitt2, Aug 10, 2024)
b6af547  correct internal links (lfoppiano, Aug 12, 2024)
a266e73  avoid null pointer exceptions during training data generation with ce… (lfoppiano, Aug 12, 2024)
0019c54  fix merging (kermitt2, Aug 15, 2024)
4b2bda6  Merge branch 'glutton-0.3' into release-0.8.1 (kermitt2, Aug 15, 2024)
fc70816  udpate github actions and fix dockerfile (lfoppiano, Aug 27, 2024)
a2720a3  update secrets env variables (lfoppiano, Aug 30, 2024)
7025697  add suffix when building manually (lfoppiano, Aug 30, 2024)
5b045af  Merge branch 'master' into release-0.8.1 (lfoppiano, Aug 30, 2024)
e40e30c  only add the suffix to the latest-crf (lfoppiano, Aug 30, 2024)
8cc06b7  avoid gradle to download jdks automatically, and use the old source/t… (lfoppiano, Aug 30, 2024)
d15e4d2  fix kotlin and java target warning (lfoppiano, Aug 30, 2024)
2 changes: 1 addition & 1 deletion .github/workflows/ci-build-manual-crf.yml
@@ -42,6 +42,6 @@ jobs:
registry: docker.io
pushImage: true
tags: |
latest-develop${{ github.event.inputs.suffix != '' && '-' || '' }}${{ github.event.inputs.suffix }}, latest-crf${{ github.event.inputs.suffix != '' && '-' || '' }}${{ github.event.inputs.suffix }}
latest-develop, latest-crf${{ github.event.inputs.suffix != '' && '-' || '' }}${{ github.event.inputs.suffix }}
- name: Image digest
run: echo ${{ steps.docker_build.outputs.digest }}
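The effect of the `tags` change in this hunk can be sketched in Python. This is only an illustration, assuming the second `tags` line is the one added by commit e40e30c ("only add the suffix to the latest-crf"); the function name `image_tags` is hypothetical and does not exist in the repository.

```python
def image_tags(suffix: str = "") -> list[str]:
    """Mimic the workflow's `tags` expression as it reads after this change.

    Hypothetical helper for illustration only; the real logic lives in the
    GitHub Actions expression in ci-build-manual-crf.yml.
    """
    # `suffix != '' && '-' || ''` yields '-' only when a suffix was supplied.
    separator = "-" if suffix != "" else ""
    # Only latest-crf carries the optional suffix; latest-develop is unchanged.
    return ["latest-develop", f"latest-crf{separator}{suffix}"]


if __name__ == "__main__":
    print(image_tags())        # no suffix supplied
    print(image_tags("test"))  # manual build with a suffix
```

With an empty suffix both tags are emitted unchanged; with a suffix such as `test`, only the `latest-crf` tag gains a `-test` ending.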
6 changes: 3 additions & 3 deletions .github/workflows/ci-build-unstable.yml
@@ -13,11 +13,11 @@ jobs:

steps:
- uses: actions/checkout@v4
- name: Set up JDK 17
- name: Set up JDK 11
uses: actions/setup-java@v4
with:
java-version: '17.0.10+7'
distribution: 'temurin'
java-version: '11'
distribution: 'adopt'
cache: 'gradle'
- name: Build with Gradle
run: ./gradlew clean assemble --info --stacktrace --no-daemon
23 changes: 23 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,29 @@ All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

## [0.8.1] - 2024-06-10

### Added
- Identified URLs are now added in the TEI output #1099
- Added DL models for patent processing #1082
- Copyright and licence identification models #1078
- Add research infrastructure recognition for funding processing #1085

### Changed
- Improved the recognition of URLs using (when available) PDF annotations, such as clickable links
- Updated TEI schema #1084
- Review patent process #1082
- Add Kotlin language to support development and testing #1096

### Fixed
- Sentence segmentation avoids to split sentences with an URL in the middle #1097
- Sentence segmentation is now applied to funding and acknowledgement #1106
- Docker image was optimized to reduce the needed space #1088
- Fixed OOBE when processing large quantities of notes #1075
- Corrected `<title>` coordinate attribute name #1070
- Fix missing coordinates in paragraph continuation #1076
- Fixed JSON log output

## [0.8.0] - 2023-11-19

### Added
3 changes: 1 addition & 2 deletions Readme.md
@@ -105,11 +105,10 @@ Detailed end-to-end [benchmarking](https://grobid.readthedocs.io/en/latest/Bench
A series of additional modules have been developed for performing __structure aware__ text mining directly on scholar PDF, reusing GROBID's PDF processing and sequence labelling weaponry:

- [software-mention](https://github.com/ourresearch/software-mentions): recognition of software mentions and associated attributes in scientific literature
- [datastet](https://github.com/kermitt2/datastet): identification of named and implicit research datasets and associated attributes in scientific articles
- [datastet](https://github.com/kermitt2/datastet): identification of sections and sentences introducing datasets in a scientific article, identification of dataset names and attributes (implict and named datasets) and classification of the type of datasets
- [grobid-quantities](https://github.com/kermitt2/grobid-quantities): recognition and normalization of physical quantities/measurements
- [grobid-superconductors](https://github.com/lfoppiano/grobid-superconductors): recognition of superconductor material and properties in scientific literature
- [entity-fishing](https://github.com/kermitt2/entity-fishing), a tool for extracting Wikidata entities from text and document, which can also use Grobid to pre-process scientific articles in PDF, leading to more precise and relevant entity extraction and the capacity to annotate the PDF with interactive layout
- [datastet](https://github.com/kermitt2/datastet): identification of sections and sentences introducing datasets in a scientific article, identification of dataset names (implict and named datasets) and classification of the type of these datasets
- [grobid-ner](https://github.com/kermitt2/grobid-ner): named entity recognition
- [grobid-astro](https://github.com/kermitt2/grobid-astro): recognition of astronomical entities in scientific papers
- [grobid-bio](https://github.com/kermitt2/grobid-bio): a toy bio-entity tagger using BioNLP/NLPBA 2004 dataset
31 changes: 21 additions & 10 deletions build.gradle
@@ -60,19 +60,29 @@ subprojects {
}
}

// sourceCompatibility = 1.11
// targetCompatibility = 1.11

kotlin {
jvmToolchain(17)
}

java {
toolchain {
languageVersion.set(JavaLanguageVersion.of(17))
sourceCompatibility = 1.11
targetCompatibility = 1.11

tasks.withType(KotlinCompile).configureEach {
sourceCompatibility = JavaVersion.VERSION_11
targetCompatibility = JavaVersion.VERSION_11
kotlinOptions {
jvmTarget = JavaVersion.VERSION_11
}
}

// kotlin {
// jvmToolchain(11)
// }

// java {
// toolchain {
// languageVersion.set(JavaLanguageVersion.of(11))
// vendor.set(JvmVendorSpec.ADOPTIUM)
//
// }
// }

repositories {
mavenCentral()
maven {
@@ -316,6 +326,7 @@ project("grobid-home") {
}

import org.apache.tools.ant.taskdefs.condition.Os
import org.jetbrains.kotlin.gradle.tasks.KotlinCompile

project(":grobid-service") {
apply plugin: 'application'