Add output files class #1311

Merged
merged 50 commits into from
Feb 8, 2024
Commits
e89fbb0
Add a tail method to inspect output files
pmrv Nov 10, 2022
86df6e8
Extract file reading into method
pmrv Nov 10, 2022
460dad9
Use _read_file in tail
pmrv Nov 10, 2022
a3fbe13
Move list files to util
pmrv Nov 10, 2022
e7f308f
Move read_file to util
pmrv Nov 10, 2022
36d3d90
Extract definition of job archive name
pmrv Nov 10, 2022
9a00f53
Add transparent compression support
pmrv Nov 10, 2022
a5965d0
Fix typos and inadvertent recursion in _job_is_compressed/_job_list_f…
pmrv Nov 22, 2022
ecde142
Fix sloppiness
pmrv Nov 22, 2022
138e525
Add test and even more bugfixes
pmrv Nov 22, 2022
16d04f8
Format black
pyiron-runner Nov 22, 2022
e3a2bb5
Add efficient reverse reading with the monty package
pmrv Dec 4, 2022
4e4f44d
More line separator replacements
pmrv Dec 4, 2022
9392e49
Fix typo
pmrv Dec 4, 2022
52a7eeb
Merge remote-tracking branch 'origin/main' into tail
jan-janssen Dec 5, 2022
6a4cf9f
Merge branch 'main' into tail
liamhuber Dec 5, 2022
26e0336
Use system specific linesep
pmrv Dec 4, 2022
7d2089e
Format black
pyiron-runner Dec 7, 2022
8ecbfd1
Enable newline translation in windows tests
pmrv Jan 6, 2023
8db9284
Add a simple file browser
pmrv Jan 6, 2023
e1ef8c1
Add output files class
jan-janssen Feb 4, 2024
020c977
Fix working directory bug
jan-janssen Feb 4, 2024
0079986
Merge remote-tracking branch 'origin/main' into output_files
jan-janssen Feb 5, 2024
fc8ad6a
Move OutputFiles to GenericJob
jan-janssen Feb 5, 2024
c137a6d
rename output files to just files
jan-janssen Feb 5, 2024
3146b33
fix tests
jan-janssen Feb 5, 2024
254dc3c
fix tests for windows
jan-janssen Feb 5, 2024
7af0164
Merge remote-tracking branch 'origin/files' into output_files
jan-janssen Feb 6, 2024
086565c
update monty
jan-janssen Feb 6, 2024
c010b5c
fix
jan-janssen Feb 6, 2024
b597a57
more fixes
jan-janssen Feb 6, 2024
7b6b94e
Format black
pyiron-runner Feb 6, 2024
b02aa2f
fix typing
jan-janssen Feb 6, 2024
98a2db5
Merge remote-tracking branch 'origin/output_files' into output_files
jan-janssen Feb 6, 2024
2ad9391
fix tests
jan-janssen Feb 6, 2024
fe1bf2e
fix ipython representation
jan-janssen Feb 6, 2024
5eb2054
Format black
pyiron-runner Feb 6, 2024
36ca620
try to fix tests
jan-janssen Feb 6, 2024
123ee71
Merge remote-tracking branch 'origin/output_files' into output_files
jan-janssen Feb 6, 2024
905bd1a
try to remove \r
jan-janssen Feb 6, 2024
4ee136f
Use working directory rather than job object
jan-janssen Feb 7, 2024
5cca59e
Return files as strings
jan-janssen Feb 7, 2024
7718b98
Format black
pyiron-runner Feb 7, 2024
01a9e70
Merge remote-tracking branch 'origin/main' into output_files
jan-janssen Feb 7, 2024
f48201a
Update pyiron_base/jobs/job/core.py
jan-janssen Feb 7, 2024
2f8cd56
Update pyiron_base/jobs/job/extension/files.py
jan-janssen Feb 7, 2024
5ae4693
Update pyiron_base/jobs/job/extension/files.py
jan-janssen Feb 7, 2024
149fa23
fix test
jan-janssen Feb 7, 2024
e6691f6
Merge remote-tracking branch 'origin/output_files' into output_files
jan-janssen Feb 7, 2024
2ff2c78
fix genericjob test
jan-janssen Feb 7, 2024
1 change: 1 addition & 0 deletions .ci_support/environment-docs.yml
@@ -11,6 +11,7 @@ dependencies:
- h5io_browser =0.0.7
- h5py =3.10.0
- jinja2 =3.1.3
- monty =2024.2.2
- numpy =1.26.3
- pandas =2.2.0
- pint =0.23
1 change: 1 addition & 0 deletions .ci_support/environment-old.yml
@@ -6,6 +6,7 @@ dependencies:
- h5io_browser =0.0.6
- h5py =3.6.0
- jinja2 =2.11.3
- monty =2024.2.2
- numpy =1.23.5
- pandas =2.0.0
- pint =0.18
1 change: 1 addition & 0 deletions .ci_support/environment.yml
@@ -8,6 +8,7 @@ dependencies:
- h5io_browser =0.0.7
- h5py =3.10.0
- jinja2 =3.1.3
- monty =2024.2.2
- numpy =1.26.3
- pandas =2.2.0
- pint =0.23
24 changes: 18 additions & 6 deletions pyiron_base/jobs/job/core.py
@@ -10,6 +10,7 @@
import os
import posixpath
import shutil
from typing import List
import warnings

from pyiron_base.interfaces.has_groups import HasGroups
@@ -26,11 +27,15 @@
    _job_is_compressed,
    _job_compress,
    _job_decompress,
    _job_list_files,
    _job_read_file,
    _job_delete_files,
    _job_delete_hdf,
    _job_remove_folder,
)
from pyiron_base.state import state
from pyiron_base.utils.deprecate import deprecate
from pyiron_base.jobs.job.extension.files import FileBrowser

__author__ = "Jan Janssen"
__copyright__ = (
@@ -134,6 +139,12 @@ def __init__(self, project, job_name):
    def content(self):
        return self._hdf5_content

    @property
    def files(self):
        return FileBrowser(working_directory=self.working_directory)

    files.__doc__ = FileBrowser.__doc__

    @property
    def job_name(self):
        """
@@ -595,6 +606,7 @@ def get_job_id(self, job_specifier=None):
            return response[-1]["id"]
        return None

    @deprecate("use job.files.list()")
    def list_files(self):
        """
        List files inside the working directory
@@ -605,9 +617,7 @@ def list_files(self):
         Returns:
             list: list of file names
         """
-        if os.path.isdir(self.working_directory):
-            return os.listdir(self.working_directory)
-        return []
+        return _job_list_files(self)

     def list_childs(self):
         """
@@ -898,9 +908,11 @@ def __getitem__(self, item):
         """

         if item in self.list_files():
-            file_name = posixpath.join(self.working_directory, "{}".format(item))
-            with open(file_name) as f:
-                return f.readlines()
+            warnings.warn(
+                "Using __getitem__ on a job to access files is deprecated: use job.files instead!",
+                category=DeprecationWarning,
+            )
+            return _job_read_file(self, item)

         # first try to access HDF5 directly to make the common case fast
         try:
118 changes: 118 additions & 0 deletions pyiron_base/jobs/job/extension/files.py
@@ -0,0 +1,118 @@
import os
from typing import List
from pyiron_base.jobs.job.util import (
    _working_directory_list_files,
    _working_directory_read_file,
)


class FileBrowser:
    """
    Allows browsing the files in a job directory.

    By default this object prints itself as a listing of the job directory and
    the files inside.

    >>> job.files
    /path/to/my/job:
    \tpyiron.log
    \terror.out

    Access to the names of files is provided with :meth:`.list`

    >>> job.files.list()
    ['pyiron.log', 'error.out', 'INCAR']

    Access to the contents of files is provided by indexing into this object,
    which returns a list of lines in the file

    >>> job.files['error.out']
    ["Oh no\n", "Something went wrong!\n"]

    The :meth:`.tail` method prints the last lines of a file to stdout

    >>> job.files.tail('error.out', lines=1)
    Something went wrong!

    Files that have valid Python variable names can also be accessed by
    attribute notation

    >>> job.files.INCAR # doctest: +SKIP
    File('INCAR')
    """

    __slots__ = ("_working_directory",)

    def __init__(self, working_directory):
        self._working_directory = working_directory

    def _get_file_dict(self):
        return {
            f.replace(".", "_"): f
            for f in _working_directory_list_files(
                working_directory=self._working_directory
            )
        }

    def __dir__(self):
        return list(self._get_file_dict().keys()) + super().__dir__()

    def list(self) -> List[str]:
        """
        List all files in the working directory of the job.
        """
        return _working_directory_list_files(working_directory=self._working_directory)

    def _ipython_display_(self):
        path = self._working_directory + ":"
        files = [
            "\t" + f
            for f in _working_directory_list_files(
                working_directory=self._working_directory
            )
        ]
        print(os.linesep.join([path, *files]))

    def tail(self, file: str, lines: int = 100):
        """
        Print the last lines of a file.

        Args:
            file (str): filename
            lines (int): number of lines to print

        Raises:
            FileNotFoundError: if the given file does not exist
        """
        print(
            *_working_directory_read_file(
                working_directory=self._working_directory, file_name=file, tail=lines
            ),
            sep="",
        )

    def __getitem__(self, item):
        if item not in _working_directory_list_files(
            working_directory=self._working_directory
        ):
            raise KeyError(item)

        return File(os.path.join(self._working_directory, item))

    def __getattr__(self, item):
        try:
            return self[self._get_file_dict()[item]]
        except KeyError:
            raise AttributeError(item) from None
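
The attribute access in `_get_file_dict` and `__getattr__` relies on replacing every `.` in a file name with `_`. A minimal standalone sketch of that mapping follows. `build_file_dict` and the example listings are hypothetical; they stand in for `_working_directory_list_files`, whose implementation is not shown in this diff. The sketch also illustrates one consequence of the scheme: two names that differ only in `.` versus `_` end up under the same key.

```python
def build_file_dict(file_names):
    """Map attribute-style keys to real file names by replacing '.' with '_'."""
    return {name.replace(".", "_"): name for name in file_names}


# Hypothetical directory listing, for illustration only.
listing = ["pyiron.log", "error.out", "INCAR"]
file_dict = build_file_dict(listing)

print(file_dict["error_out"])  # error.out
print(file_dict["INCAR"])      # INCAR

# Two names that differ only in '.' versus '_' collapse onto one key;
# the entry that comes later in the listing silently wins.
clash = build_file_dict(["run.log", "run_log"])
print(clash)  # {'run_log': 'run_log'}
```

Names that are still not valid identifiers after the replacement (for example, ones starting with a digit) cannot be written in attribute syntax and remain reachable through indexing, as in `job.files['1.out']`.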


class File(str):
Contributor

Suggested change
-class File(str):
+class File:
+    __slots__ = ("_path",)
+    def __init__(self, path: str):
+        """
+        Wrap a file.
+        Args:
+            path (str): path to file
+        """
+        self._path = path

Contributor

We're not actually using any string functionality here, and I'd rather not let people accidentally perform string operations on this object.

Member Author

I would like to keep the string part because I can then use os.path.join() on the object, which is what I need to access the path.
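
One possible middle ground between the two positions above, sketched here under the assumption that only path helpers such as `os.path.join`, `os.path.dirname` and `os.path.basename` need to accept the object, is the standard `os.PathLike` protocol. The class name `PathFile` is hypothetical and not part of this PR; the sketch only shows that a class defining `__fspath__` works with the `os.path` functions without inheriting the rest of the `str` interface.

```python
import os


class PathFile:
    """Hypothetical alternative to ``File(str)``: wraps a path without
    inheriting the full ``str`` interface."""

    __slots__ = ("_path",)

    def __init__(self, path: str):
        self._path = path

    def __fspath__(self) -> str:
        # os.fspath() and the os.path helpers call this to obtain the path.
        return self._path

    def __repr__(self) -> str:
        return f"PathFile({self._path!r})"


f = PathFile(os.path.join("job_dir", "error.out"))
print(os.path.basename(f))  # error.out
print(os.path.dirname(f))   # job_dir

# Joining works as well; the separator in the result depends on the platform.
print(os.path.join(f, "child"))

# String methods are not available, so accidental string operations fail early.
print(hasattr(f, "upper"))  # False
```

The trade-off is that code expecting a real `str` (for example, string formatting of the path or `isinstance(x, str)` checks) would need an explicit `os.fspath(f)` call.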

    def tail(self, lines: int = 100):
        print(
            *_working_directory_read_file(
                working_directory=os.path.dirname(self),
                file_name=os.path.basename(self),
Comment on lines +113 to +114
Contributor

Suggested change
-                working_directory=os.path.dirname(self),
-                file_name=os.path.basename(self),
+                working_directory=os.path.dirname(self._path),
+                file_name=os.path.basename(self._path),

See above.

                tail=lines,
            ),
            sep="",
        )
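
The body of `_working_directory_read_file` is not part of this excerpt; the commit log mentions reverse reading with the monty package. As a rough illustration of what a `tail=lines` argument has to do, here is a standard-library sketch. It is not the PR's implementation: the helper name `read_tail` is hypothetical, and it reads the file front to back while keeping only the last `lines` lines, which is simpler but less efficient on large files than reading from the end.

```python
import os
import tempfile
from collections import deque


def read_tail(path: str, lines: int = 100) -> list:
    """Return the last ``lines`` lines of a text file, line endings included."""
    with open(path) as f:
        # A deque with maxlen discards older lines as newer ones arrive.
        return list(deque(f, maxlen=lines))


with tempfile.TemporaryDirectory() as tmp:
    log = os.path.join(tmp, "error.out")
    with open(log, "w") as f:
        f.write("Oh no\nSomething went wrong!\n")
    last = read_tail(log, lines=1)
    # Same printing convention as the tail methods above: the lines carry
    # their own newlines, so they are joined with an empty separator.
    print(*last, sep="")
```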