
Copy TeX/LaTeX text without TeX markup, e.g. TeX \& friends -> TeX & friends #32

Open
antmw1361 opened this issue Feb 6, 2018 · 7 comments

Comments

@antmw1361

This would be helpful when copying text from TeXstudio and pasting it into plain-text boxes.
For example, when submitting a paper to a journal, we often need to copy and paste the abstract into a plain-text box on the ManuscriptCentral or Evise websites. I often have to de-TeX the text by hand and strip out the LaTeX markup. It would be great if TeXstudio could help.

@dbitouze
Contributor

dbitouze commented Feb 6, 2018

You could just copy-paste from the PDF file.

@sunderme
Member

sunderme commented Feb 6, 2018

This is a task for a script or macro. If anybody wants to publish something here, or better, on the wiki, go ahead.
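As a rough illustration of what such a script might do, here is a minimal Python sketch. It is not a TeXstudio macro and not an existing tool: the `detex` function and the `ESCAPES` table are hypothetical names, and only a handful of simple constructs are handled (escaped special characters, dash ligatures, ties, and three formatting commands). Math, citations, and references are left untouched.

```python
import re

# Hypothetical table for this sketch: escaped TeX specials and their plain form.
ESCAPES = {r"\&": "&", r"\%": "%", r"\$": "$", r"\#": "#", r"\_": "_"}

# One-argument formatting commands whose argument should be kept as plain text.
FORMATTING = re.compile(r"\\(?:emph|textbf|textit)\{([^{}]*)\}")


def detex(text: str) -> str:
    """Strip a small, fixed set of LaTeX markup from text (sketch only)."""
    # Unwrap formatting commands; repeat so nested commands are
    # unwrapped from the inside out.
    previous = None
    while previous != text:
        previous = text
        text = FORMATTING.sub(r"\1", text)
    # Replace escaped special characters with the plain character.
    for tex, plain in ESCAPES.items():
        text = text.replace(tex, plain)
    # TeX dash ligatures: handle the longer one first.
    text = text.replace("---", "\u2014").replace("--", "\u2013")
    # A tie (~) is a non-breaking space in TeX; use an ordinary space.
    return text.replace("~", " ")


print(detex(r"TeX \& friends"))
```

With the title example of this issue as input, the call above prints `TeX & friends`. Anything beyond these few constructs would need a real LaTeX-aware tool rather than regular expressions.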

@dbitouze
Contributor

dbitouze commented Feb 7, 2018

Other solutions:

@lesshaste

@dbitouze Copying and pasting from pdf is often a fatally bad idea. I find you often get symbols that look very much like what you expect but are subtly different, and which then come out completely wrong when, for example, printed. The most common culprit is the dash “-”.

@dbitouze
Contributor

Copying and pasting from pdf is often a fatally bad idea.

I disagree: if the text you copied isn't meant to be reused in a .tex file, it's not a problem and, even better, it is a feature. For example, if you asked for an en-dash in your .tex source, you get an en-dash in the PDF output, which you can then conveniently copy and paste.

@thatlittleboy
Contributor

@dbitouze I find the most problematic issue when copying from a PDF is the hyphenation of long words at the ends of lines, which you obviously don't want to appear in the pasted text.

@antmw1361 I find it hard for anyone to help (with scripting) without examples of the kind of text to expect. Would there be inline math? Superscripts, subscripts, etc.? References, cites, labels? Anyway, I concur that the easiest way is still to copy from the PDF and then fix the hyphenation and similar artifacts.
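For the hyphenation clean-up step, a minimal Python sketch could look like the following. The function name `join_hyphenated_lines` is made up for this example, and the rule it applies is an assumption: a lowercase letter, a hyphen, a line break, and another lowercase letter are taken to be one word split by line-end hyphenation.

```python
import re


def join_hyphenated_lines(text: str) -> str:
    """Rejoin words that were split by a hyphen at the end of a line."""
    # Remove "-\n" only when it sits between two lowercase letters,
    # so dashes surrounded by spaces or digits are left alone.
    return re.sub(r"(?<=[a-z])-\n(?=[a-z])", "", text)


print(join_hyphenated_lines("nulla. Cur-\nabitur auctor semper nulla."))
```

This prints `nulla. Curabitur auctor semper nulla.` on one line. The heuristic cannot tell a line-end hyphen from a genuine compound such as "well-known" broken after its hyphen, so the result still needs a quick read-through.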

@dbitouze
Contributor

@thatlittleboy There is no such issue with pdftotext. For instance, take the PDF produced from this test.tex file:

\documentclass{article}
\usepackage{lipsum}
\begin{document}
\lipsum[1]
\end{document}

processed with:

pdftotext -nopgbrk test.pdf

generates a test.txt file containing:

Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Ut purus elit,
vestibulum ut, placerat ac, adipiscing vitae, felis. Curabitur dictum gravida
mauris. Nam arcu libero, nonummy eget, consectetuer id, vulputate a, magna.
Donec vehicula augue eu neque. Pellentesque habitant morbi tristique senectus
et netus et malesuada fames ac turpis egestas. Mauris ut leo. Cras viverra
metus rhoncus sem. Nulla et lectus vestibulum urna fringilla ultrices. Phasellus
eu tellus sit amet tortor gravida placerat. Integer sapien est, iaculis in, pretium
quis, viverra ac, nunc. Praesent eget sem vel leo ultrices bibendum. Aenean
faucibus. Morbi dolor nulla, malesuada eu, pulvinar at, mollis ac, nulla. Curabitur auctor semper nulla. Donec varius orci eget risus. Duis nibh mi, congue
eu, accumsan eleifend, sagittis quis, diam. Duis eget orci sit amet orci dignissim
rutrum.

1
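The output above still has hard line breaks and a trailing page number, which may be unwanted in a plain-text submission box. A minimal Python sketch for flattening it follows; the `reflow` function is a hypothetical helper, and it assumes that paragraphs are separated by blank lines and that a block consisting only of digits is a page number.

```python
def reflow(text: str) -> str:
    """Join each blank-line-separated block into one line (sketch only)."""
    paragraphs = []
    for block in text.split("\n\n"):
        words = block.split()
        # Skip empty blocks and blocks that are only a page number.
        if not words or (len(words) == 1 and words[0].isdigit()):
            continue
        # Collapse line breaks and repeated spaces inside the block.
        paragraphs.append(" ".join(words))
    return "\n\n".join(paragraphs)


sample = "Lorem ipsum dolor sit amet,\nvestibulum ut, placerat ac.\n\n1\n"
print(reflow(sample))
```

For the sample string this prints the two text lines joined into one and drops the `1`. A paragraph that genuinely consists of a single number would also be dropped, so the page-number rule is only safe for text like an abstract.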
