Eliminate Leaks by wrapping resources in contextlib with-blocks #36

Closed
wants to merge 21 commits into from
21 commits
941b6c7
feat(src): subst `reduce` with `sum` for size calcs
ankostis Oct 24, 2016
2ec8c99
feat(io): Retrofit streams as context-managers.
ankostis Oct 1, 2016
758c293
feat(io): breaking API: retrofit Packers as context-managers!
ankostis Oct 2, 2016
ba10cf1
chore(ci): depend on "leaks" smmap branch
ankostis Oct 24, 2016
b199bd5
chore(ci): depend on "leaks" smmap branch
ankostis Oct 24, 2016
f0988cc
chore(gitdb): actually delete submodule from sources
ankostis Oct 24, 2016
08b1f5f
chore(ver): bump 2.0.0-->2.1.0.dev0
ankostis Oct 24, 2016
534c9bb
fix(win): FIX and HIDE 2 win-errors remaining
ankostis Oct 24, 2016
47e8884
fix(pack): restore packers as LazyMixins
ankostis Oct 24, 2016
79a754a
chore(ver): bump 2.0.0.dev0-->2.1.0.dev2 (yes, last ver was old)
ankostis Oct 24, 2016
a2e49d9
style(listuple): use literals for empty lists/tuples
ankostis Oct 24, 2016
61ea9bb
test(travis): enable all tests (inc perf) on TravisCI
ankostis Oct 25, 2016
a566e11
refact(win_errs): move HIDE_WINDOWS_KNOWN_ERRORS from main-code to test
ankostis Oct 25, 2016
fda6bc1
fix(io): BREAKING, wrap more out-stream usages
ankostis Oct 25, 2016
7aa9590
doc(changes): describe v2.1.0 changes on API
ankostis Oct 25, 2016
efaa6a1
chore(deps): pin dev dependencies on requirements text
ankostis Oct 25, 2016
fa24dce
feat(streams): use named-tuples
ankostis Oct 24, 2016
7524f69
fix(compat): PY3-check must hold even for PY4
ankostis Oct 25, 2016
c63db69
refact(elapsed): improve no div0 when time-elapsed too small
ankostis Oct 25, 2016
40199ba
refact(util): ask global `util.mman` from mman module
ankostis Oct 27, 2016
fa89615
chore(ver): bump 2.0.0.dev1-->2.1.0.dev3, and more
ankostis Oct 27, 2016
2 changes: 1 addition & 1 deletion .appveyor.yml
@@ -37,7 +37,7 @@ install:
git config --global user.email "[email protected]"
git config --global user.name "Travis Runner"

- pip install -e .
- pip install -r requirements.txt

build: false

3 changes: 0 additions & 3 deletions .gitmodules
@@ -1,3 +0,0 @@
[submodule "smmap"]
path = gitdb/ext/smmap
url = https://github.com/Byron/smmap.git
1 change: 1 addition & 0 deletions .travis.yml
@@ -12,6 +12,7 @@ git:
depth: 1000
install:
- pip install coveralls
- pip install -r requirements.txt
script:
- ulimit -n 48
- ulimit -n
69 changes: 55 additions & 14 deletions doc/source/changes.rst
@@ -2,42 +2,83 @@
Changelog
#########

*****
2.1.0
* **BREAKING API:** retrofit streams and (internal) packers as context-managers.

Specifically, if you use the packers directly
(``git.pack.PackIndexFile``, ``git.pack.PackFile`` & ``git.pack.PackEntity``),
they must always be used from within a ``with ..`` block, or else
you will get *mmap-cursor* missing errors.

.. Tip::

You can "enter" ``PackIndexFile`` & ``PackFile`` multiple times, but ``PackEntity`` only once,
to detect and avoid sloppy deep-nesting.
Since the ``git.pack.PackEntity`` class just coalesces ``PackIndexFile`` & ``PackFile``,
you may "enter" either each internal packer separately, or the entity only once.

* **BREAKING API:** some utilities moved between ``git.util``, ``git.const`` & ``git.utils.compat``.
* Fix (probably) all leaks in Windows.

.. Note::

The problem is that on Linux, any open files go unnoticed or are collected by the GC.
But on *Windows* (and specifically on PY3, where GC is not deterministic),
the underlying files cannot be deleted due to *access violation*.

That's a Good-thing|copy|, because it is dangerous to *leak* memory-mapped handles.
Actually, *Windows* may leak them even after the process that created them has died,
needing a full restart(!) to clean them up (signing out is not enough).


* Stop importing the *smmap* submodule *at runtime*; the submodule is deleted completely from the sources.

.. Tip::

Developers now have to specify the exact *smmap* dependency in the ``requirements.txt`` file, and
remember to update it before a final release.

* Run the test cases on AppVeyor as well.
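
A minimal sketch of the idea behind the 2.1.0 entries above, assuming nothing about the actual gitdb classes: ``MappedFile`` and ``demo`` are hypothetical names, not part of gitdb or smmap. The wrapper maps a file on the first "enter", allows nested "enters", and releases the mmap and the file handle on the last "exit", so the file can be deleted afterwards (the step that fails on Windows when a handle leaks).

```python
import mmap
import os
import tempfile


class MappedFile(object):
    """Hypothetical re-entrant wrapper: maps on first enter, unmaps on last exit."""

    def __init__(self, path):
        self._path = path
        self._depth = 0      # how many with-blocks are currently active
        self._fd = None
        self._map = None

    def __enter__(self):
        if self._depth == 0:
            self._fd = open(self._path, "rb")
            self._map = mmap.mmap(self._fd.fileno(), 0, access=mmap.ACCESS_READ)
        self._depth += 1
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self._depth -= 1
        if self._depth == 0:
            # Release resources deterministically instead of waiting for the GC.
            self._map.close()
            self._fd.close()
            self._map = self._fd = None

    def read(self, size):
        if self._map is None:
            raise ValueError("not inside a with-block: no mmap available")
        return self._map[:size]


def demo():
    fd, path = tempfile.mkstemp()
    os.write(fd, b"PACK-demo")
    os.close(fd)

    mapped = MappedFile(path)
    with mapped:
        with mapped:                      # nested "enter" is allowed
            head = mapped.read(4)
        still_open = mapped._map is not None  # inner exit must not unmap

    os.remove(path)                       # all handles are released by now
    return head, still_open, os.path.exists(path)
```

Calling ``read()`` outside any ``with`` block raises in this sketch, which mirrors the kind of "missing mmap" error the changelog warns about. A single-entry variant (as described for ``PackEntity``) would raise on a second ``__enter__`` instead of counting depth.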


0.6.1
*****
=====

* Fixed possibly critical error, see https://github.com/gitpython-developers/GitPython/issues/220

- However, it only seems to occur on high-entropy data and didn't recur after the fix

*****

0.6.0
*****
=====

* Added support for Python 3.X
* Removed all `async` dependencies and all `*_async` versions of methods with it.

*****

0.5.4
*****
=====
* Adjusted implementation to use the SlidingMemoryManager by default in python 2.6 for efficiency reasons. In Python 2.4, the StaticMemoryManager will be used instead.

*****

0.5.3
*****
=====
* Added support for smmap. SmartMMap allows resources to be managed and controlled. This brings the implementation closer to the way git handles memory maps, such that unused cached memory maps will automatically be freed once a resource limit is hit. The memory limit on 32 bit systems remains though as a sliding mmap implementation is not used for performance reasons.

*****

0.5.2
*****
=====
* Improved performance of the c implementation, which now uses reverse-delta-aggregation to make a memory bound operation CPU bound.

*****

0.5.1
*****
=====
* Restored most basic python 2.4 compatibility, such that gitdb can be imported within python 2.4, pack access cannot work though. This at least allows Super-Projects to provide their own workarounds, or use everything but pack support.

*****

0.5.0
*****
=====
Initial Release


.. |copy| unicode:: U+000A9 .. COPYRIGHT SIGN
59 changes: 29 additions & 30 deletions doc/source/tutorial.rst
@@ -35,29 +35,28 @@ Databases support query and/or addition of objects using simple interfaces. They
Both have two sets of methods, one of which allows interacting with single objects, the other one allowing to handle a stream of objects simultaneously and asynchronously.

Acquiring information about an object from a database is easy if you have a SHA1 to refer to the object::


ldb = LooseObjectDB(fixture_path("../../../.git/objects"))

for sha1 in ldb.sha_iter():
oinfo = ldb.info(sha1)
ostream = ldb.stream(sha1)
assert oinfo[:3] == ostream[:3]

assert len(ostream.read()) == ostream.size
# END for each sha in database

with ldb.stream(sha1) as ostream:
assert oinfo[:3] == ostream[:3]

assert len(ostream.read()) == ostream.size

To store information, you prepare an *IStream* object with the required information. The provided stream will be read and converted into an object, and the respective 20 byte SHA1 identifier is stored in the IStream object::

data = "my data"
istream = IStream("blob", len(data), StringIO(data))
# the object does not yet have a sha
assert istream.binsha is None
ldb.store(istream)
# now the sha is set
assert len(istream.binsha) == 20
assert ldb.has_object(istream.binsha)
with IStream("blob", len(data), StringIO(data)) as istream:

# the object does not yet have a sha
assert istream.binsha is None
ldb.store(istream)
# now the sha is set
assert len(istream.binsha) == 20
assert ldb.has_object(istream.binsha)
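
As a rough illustration of where the 20-byte identifier comes from: git hashes a header of the form ``<type> <size>\0`` followed by the raw content with SHA1. The helper name ``git_object_sha1`` below is hypothetical and not part of gitdb; it only sketches the hashing step, not how ``ldb.store()`` is implemented.

```python
import hashlib


def git_object_sha1(obj_type, data):
    """Return the binary SHA1 git assigns to an object of the given type.

    obj_type and data are bytes, e.g. b"blob" and the blob content.
    """
    header = obj_type + b" " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).digest()


binsha = git_object_sha1(b"blob", b"my data")
assert len(binsha) == 20  # binary form; the hex form has 40 characters
```

The binary digest is what the tutorial calls ``binsha``; the familiar 40-character form is its hexadecimal encoding.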

**********************
Asynchronous Operation
@@ -67,33 +66,33 @@ For each read or write method that allows a single-object to be handled, an *_as
Using asynchronous operations is easy, but chaining multiple operations together to form a complex one would require you to read the docs of the *async* package. At the current time, due to the *GIL*, the *GitDB* can only achieve true concurrency during zlib compression and decompression of big objects, if the respective C modules were compiled in *async*.

Asynchronous operations are scheduled by a *ThreadPool* which resides in the *gitdb.util* module::

from gitdb.util import pool

# set the pool to use two threads
pool.set_size(2)

# synchronize the mode of operation
pool.set_size(0)


Use async methods with readers, which supply items to be processed. The result is given through readers as well::

from async import IteratorReader

# Create a reader from an iterator
reader = IteratorReader(ldb.sha_iter())

# get reader for object streams
info_reader = ldb.stream_async(reader)

# read one
info = info_reader.read(1)[0]

# read all the rest until depletion
ostreams = info_reader.read()



*********
Databases
27 changes: 4 additions & 23 deletions gitdb/__init__.py
@@ -7,33 +7,14 @@
import sys
import os

#{ Initialization


def _init_externals():
"""Initialize external projects by putting them into the path"""
for module in ('smmap',):
sys.path.append(os.path.join(os.path.dirname(__file__), 'ext', module))

try:
__import__(module)
except ImportError:
raise ImportError("'%s' could not be imported, assure it is located in your PYTHONPATH" % module)
# END verify import
# END handel imports

#} END initialization

_init_externals()

__author__ = "Sebastian Thiel"
__contact__ = "[email protected]"
__homepage__ = "https://github.com/gitpython-developers/gitdb"
version_info = (2, 0, 0)
version_info = (2, 1, 0, 'dev3')
__version__ = '.'.join(str(i) for i in version_info)


# default imports
from gitdb.base import *
from gitdb.db import *
from gitdb.stream import *
from gitdb.base import * # @IgnorePep8
from gitdb.db import * # @IgnorePep8
from gitdb.stream import * # @IgnorePep8