Quality check: Title should not contain URL #12354

koppor · 2022-09-02T21:34:13Z

Example:

booktitle = {in Symposium on Automotive/Avionics Systems Engineering SAASE, http://www.jacobsschool.ucsd.edu/GordonCenter/g_leadership/l_summer/docs/saase/papers/MeedeniyaAleti Buhnova.pdf},

This is not a valid booktitle: the field should not contain a URL. The integrity check should display a warning for it.

lzmmxh · 2024-10-09T13:21:12Z

Hi, I’m interested in taking on this issue. Could you assign it to me? @koppor

koppor · 2024-10-13T13:48:37Z

@lzmmxh Done. You can find the user-facing documentation at https://docs.jabref.org/finding-sorting-and-cleaning-entries/checkintegrity. Use Ctrl+Shift+F to find the code.

11raphael · 2025-01-24T15:51:11Z

Hello, I am working with @LinusDietz as a member of a KCL student team and we are interested in this issue. Could we be assigned this?

LinusDietz · 2025-01-24T16:03:44Z

I also think this is a good feature to work on. I would suggest you give a short description (here in the issue) of how you want to minimize false positives, and consider which fields (besides title/booktitle) this integrity check should cover. For example, the author field? That way we (and the @JabRef/developers) can refine the requirements for this issue a bit more before you start writing much code.

For example, these real papers should ideally not be flagged:

- "Applying Trip@dvice Recommendation Technology to www.visiteurope.com": https://dl.acm.org/doi/abs/10.5555/1567016.1567148
- "Empirical Exploration of Language Modeling for the google.com Query Stream as Applied to Mobile Voice Search": https://link.springer.com/chapter/10.1007/978-1-4614-6018-3_8
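
To illustrate why these two titles are awkward, here is a minimal sketch of a naive check that treats anything resembling a domain name as a URL. The pattern and the function name are hypothetical and written in Python for brevity; they are not taken from the JabRef code base. Under this assumption, both legitimate titles would be flagged:

```python
import re

# Hypothetical naive pattern (an assumption for illustration, not JabRef code):
# any token shaped like "name.tld", with optional subdomains in front.
NAIVE_DOMAIN = re.compile(r"[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}")


def naive_contains_url(title: str) -> bool:
    """Return True if the title contains anything that looks like a domain name."""
    return NAIVE_DOMAIN.search(title) is not None


real_titles = [
    "Applying Trip@dvice Recommendation Technology to www.visiteurope.com",
    "Empirical Exploration of Language Modeling for the google.com Query Stream "
    "as Applied to Mobile Voice Search",
]

for title in real_titles:
    # Both real titles trigger the naive check, i.e. they are false positives.
    print(naive_contains_url(title), "-", title)
```

A check along these lines is too broad for this issue, which is why the requirements should say which URL shapes count as an error.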

calixtus · 2025-01-25T07:21:45Z

Since there was no activity from the formerly assigned contributor, I'm reassigning this issue to @11raphael. As @LinusDietz is monitoring and supporting your progress, I'm looking forward to your pull request. Please open a draft PR early so we can follow your changes too.

RapidShotzz · 2025-01-27T12:50:34Z

Hello @LinusDietz @11raphael, we were thinking of flagging full URLs that are not supposed to appear in the title/booktitle field, e.g. "Exploring the Impact of Social Media on Education: https://www.example.com/education-impact". Additionally, we aim to flag URLs that are embedded in the middle of the title text, as well as URLs that are not related to a research topic.

To avoid false positives and incorrect flagging, we want to accept titles that mention website names as part of the topic, e.g. "Applying Trip@dvice Recommendation Technology to www.visiteurope.com". Moreover, we aim to allow partial URLs or website references used in context, and domain names that form part of a technical term.

The integrity check will focus on ensuring that URLs starting with http://, https:// or www. are not mistakenly included in the title/booktitle fields. To minimise false positives, the check will only flag full URLs that are followed by a path, and will avoid flagging bare domain names or website references that are part of valid research titles.

Besides the title and booktitle fields, we thought about checking other fields where URLs are not usually expected, such as the author field. This would ensure that the author field doesn't contain a URL next to the author's name.
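
The proposal above can be sketched as follows. This is only one possible reading of it, written in Python for brevity: the pattern `FULL_URL`, the tuple `CHECKED_FIELDS`, the function `find_url_warnings` and the warning text are all hypothetical names chosen for illustration, and the entry is modelled as a plain dictionary, not as a JabRef entry object.

```python
import re

# Assumed reading of the proposal: a "full URL" starts with http://, https://
# or www., continues with a host, and is followed by a non-empty path.
FULL_URL = re.compile(r"(?:https?://|www\.)[^\s/]+/\S+", re.IGNORECASE)

# Hypothetical field selection; which fields to cover is still under discussion.
CHECKED_FIELDS = ("title", "booktitle", "author")


def find_url_warnings(entry: dict[str, str]) -> list[str]:
    """Return one warning per checked field whose value contains a full URL."""
    warnings = []
    for field in CHECKED_FIELDS:
        value = entry.get(field, "")
        if FULL_URL.search(value):
            warnings.append(f"{field} should not contain a URL")
    return warnings


if __name__ == "__main__":
    flagged = {
        "title": "Exploring the Impact of Social Media on Education: "
                 "https://www.example.com/education-impact",
    }
    accepted = {
        "title": "Applying Trip@dvice Recommendation Technology to www.visiteurope.com",
    }
    print(find_url_warnings(flagged))   # URL with a path in the title
    print(find_url_warnings(accepted))  # bare domain name, no path
```

Because this sketch requires a path, a bare `https://example.com` would not be flagged either. Whether that is acceptable is a design choice to settle in review.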

LinusDietz · 2025-01-28T13:01:13Z

Sounds good. Go ahead and open the PR. It makes sense to open it early (while it's still a work in progress), so we can give feedback sooner.

koppor assigned lzmmxh Oct 13, 2024

koppor transferred this issue from JabRef/jabref-koppor Jan 5, 2025

koppor added the "good first issue" label (An issue intended for project-newcomers. Varies in difficulty.) Jan 5, 2025

koppor added this to Good First Issues Jan 5, 2025

github-project-automation bot moved this to Free to take in Good First Issues Jan 5, 2025

calixtus assigned 11raphael and unassigned lzmmxh Jan 25, 2025

calixtus moved this from Free to take to Assigned in Good First Issues Jan 25, 2025

RapidShotzz linked a pull request Jan 30, 2025 that will close this issue: Fix 12354 title should not contain url #12431 (open, 7 tasks)
