This patch series moves the code which talks to the remote lookaside cache, rewriting parts of it. This makes the code more maintainable, and provides a nicer, easier to reuse and extend API.
At the same time, some features are added which could allow some downstreams (fedpkg for example, but also probably centpkg) to remove some of their code completely, without losing any functionality.
Speaking of downstream users of the pyrpkg API, this patch series does not break the API, downstreams don't actually require any code change to keep working as they currently do.
All of this is thoroughly tested, the newly introduced modules are fully covered by unit tests. As a result, global test coverage goes from 26% to 36%.
In order to make this more maintainable and better tested, the code now does not call the 'curl' command as a subprocess any more. Instead, everything uses pycurl, which was already used in pyrpkg and as such no new runtime dependency is introduced.
Some code paths are marked as deprecated, and will emit messages when they are executed, so that downstream users of the pyrpkg API can learn about the deprecations, and port their code. That code could then be removed from pyrpkg eventually, but this patch series does not do that.
The new APIs are entirely documented. Try this for example:
from pyrpkg.lookaside import CGILookasideCache help(CGILookasideCache)
As a cherry on the cake, the new modules (and their unit tests) are fully compatible with Python 3, so we'll have less porting work when the time comes. Of course, the code still is compatible with Python 2.6, so things should still work on EL6.
Finally, these changes are going to help immensely when we try to move away from MD5 checksums for the lookaside cache in Fedora.
All in all, this is a lot of changes:
src/pyrpkg/__init__.py | 263 +++++++------------- src/pyrpkg/cli.py | 8 +- src/pyrpkg/errors.py | 49 ++++ src/pyrpkg/lookaside.py | 299 ++++++++++++++++++++++ src/pyrpkg/sources.py | 14 +- src/pyrpkg/utils.py | 63 +++++ test/test_lookaside.py | 591 ++++++++++++++++++++++++++++++++++++++++++++ test/test_utils.py | 159 ++++++++++++ 8 files changed, 1258 insertions(+), 188 deletions(-)
I've made a copr where I've pushed packages built with these changes, and a couple of packager-friends have been using them for a few days already. They are heavy-users of pyrpkg (through fedpkg, rhpkg and rdopkg) and all the problems they could find are fixed in this patch series.
From: Mathieu Bridon bochecha@daitauha.fr
This makes the code more compatible with Python 3, and less prone to subtle mistakes. --- src/pyrpkg/__init__.py | 46 +++++++++++++++++++++++----------------------- src/pyrpkg/cli.py | 8 ++++---- 2 files changed, 27 insertions(+), 27 deletions(-)
diff --git a/src/pyrpkg/__init__.py b/src/pyrpkg/__init__.py index 97b1171..6538b76 100644 --- a/src/pyrpkg/__init__.py +++ b/src/pyrpkg/__init__.py @@ -286,7 +286,7 @@ class Commands(object): defaults['ca'], defaults['serverca'], proxyuser=self.runas) - except koji.ssl.SSLCommon.SSL.Error, error: + except koji.ssl.SSLCommon.SSL.Error as error: for (_, _, ssl_reason) in error.message: # Use heuristic. Some OpenSSL libs doesn't store error # codes @@ -333,12 +333,12 @@ class Commands(object): else: try: localbranch = self.repo.active_branch.name - except TypeError, e: + except TypeError as e: raise rpkgError('Repo in inconsistent state: %s' % e) try: merge = self.repo.git.config('--get', 'branch.%s.merge' % localbranch) - except git.GitCommandError, e: + except git.GitCommandError as e: raise rpkgError('Unable to find remote branch. Use --dist') # Trim off the refs/heads so that we're just working with # the branch name @@ -380,11 +380,11 @@ class Commands(object): try: url = self.repo.git.config('--get', 'remote.%s.pushurl' % self.branch_remote) - except git.GitCommandError, e: + except git.GitCommandError as e: try: url = self.repo.git.config('--get', 'remote.%s.url' % self.branch_remote) - except git.GitCommandError, e: + except git.GitCommandError as e: raise rpkgError('Unable to find remote push url: %s' % e) self._push_url = url
@@ -557,7 +557,7 @@ class Commands(object): stdout=subprocess.PIPE, stderr=subprocess.PIPE) output, err = proc.communicate() - except Exception, e: + except Exception as e: if err: self.log.debug('Errors occoured while running following' ' command to get N-V-R-E:') @@ -847,7 +847,7 @@ class Commands(object): stdout=sys.stdout, stderr=sys.stderr, cwd=cwd) except (subprocess.CalledProcessError, - OSError), e: + OSError) as e: raise rpkgError(e) except KeyboardInterrupt: raise rpkgError() @@ -877,7 +877,7 @@ class Commands(object): stdout=subprocess.PIPE, stderr=subprocess.PIPE, cwd=cwd) output, error = proc.communicate() - except OSError, e: + except OSError as e: raise rpkgError(e) self.log.info(output) if proc.returncode: @@ -1021,7 +1021,7 @@ class Commands(object): proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE) output, error = proc.communicate() - except OSError, e: + except OSError as e: raise rpkgError(e) name = output if error: @@ -1039,7 +1039,7 @@ class Commands(object): stderr=subprocess.PIPE, env=env) output, error = proc.communicate() - except OSError, e: + except OSError as e: raise rpkgError(e) # work around signed SRPMs, for these rpm -qpl might print a warning # like: @@ -1186,14 +1186,14 @@ class Commands(object): # Create our new top directory try: os.mkdir(top_path) - except (OSError), e: + except OSError as e: raise rpkgError('Could not create directory for module %s: %s' % (module, e))
# Create a bare clone first. This gives us a good list of branches try: self.clone(module, top_path, bare_dir=repo_path, anon=anon) - except Exception, e: + except Exception as e: # Clean out our directory shutil.rmtree(top_path) raise @@ -1217,7 +1217,7 @@ class Commands(object): branch_git.config("--replace-all", "remote.%s.url" % self.default_branch_remote, giturl) - except (git.GitCommandError, OSError), e: + except (git.GitCommandError, OSError) as e: raise rpkgError('Could not locally clone %s from %s: %s' % (branch, repo_path, e))
@@ -1505,7 +1505,7 @@ class Commands(object): (output, errors) = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, cwd=self.path).communicate() - except Exception, e: + except Exception as e: raise rpkgError('Error running gendiff: %s' % e)
# log any errors @@ -1648,7 +1648,7 @@ class Commands(object): try: self.log.info(self.repo.git.checkout('-b', branch, '--track', totrack)) - except Exception, err: + except Exception as err: # This needs to be finer grained I think... raise rpkgError('Could not create branch %s: %s' % (branch, err)) @@ -1824,7 +1824,7 @@ class Commands(object): else: try: build_reference = self.nvr - except rpkgError, error: + except rpkgError as error: self.log.warning(error) if nvr_check: self.log.info('Note: You can skip NVR construction & NVR' @@ -2142,13 +2142,13 @@ class Commands(object): if os.path.exists(system_filename): try: shutil.copy2(system_filename, tmp_filename) - except Exception, error: + except Exception as error: raise rpkgError('Failed to create copy system config file' ' %s: %s' % (filename, error)) else: try: open(tmp_filename, 'w').close() - except Exception, error: + except Exception as error: raise rpkgError('Failed to create empty mock config' ' file %s: %s' % (tmp_filename, error)) @@ -2174,7 +2174,7 @@ class Commands(object):
try: config_content = self.mock_config() - except rpkgError, error: + except rpkgError as error: self._cleanup_tmp_dir(my_config_dir) raise rpkgError('Could not generate config file: %s' % error) @@ -2182,7 +2182,7 @@ class Commands(object): config_file = os.path.join(config_dir, '%s.cfg' % root) try: open(config_file, 'wb').write(config_content) - except IOError, error: + except IOError as error: self._cleanup_tmp_dir(my_config_dir) raise rpkgError('Could not write config file: %s' % error)
@@ -2197,7 +2197,7 @@ class Commands(object): return try: shutil.rmtree(tmp_dir) - except OSError, error: + except OSError as error: if error.errno != errno.EEXIST: raise rpkgError('Failed to remove temporary directory' ' %s. Reason: %s.' % (tmp_dir, error)) @@ -2226,14 +2226,14 @@ class Commands(object): ' request koji to create new one.', chroot_cfg) try: config_dir = self._config_dir_basic(root=root) - except rpkgError, error: + except rpkgError as error: raise rpkgError('Failed to create mock config directory:' ' %s' % error) self.log.debug('Temporary mock config directory: %s', config_dir) try: self._config_dir_other(config_dir) - except rpkgError, error: + except rpkgError as error: self._cleanup_tmp_dir(config_dir) raise rpkgError('Failed to populate mock config directory:' ' %s' % error) diff --git a/src/pyrpkg/cli.py b/src/pyrpkg/cli.py index 3df0b59..76be061 100755 --- a/src/pyrpkg/cli.py +++ b/src/pyrpkg/cli.py @@ -999,7 +999,7 @@ defined, packages will be built sequentially.""" % {'name': self.name}) def mockbuild(self): try: self.cmd.sources() - except Exception, e: + except Exception as e: self.log.error('Could not download sources: %s' % e) sys.exit(1)
@@ -1020,14 +1020,14 @@ defined, packages will be built sequentially.""" % {'name': self.name}) try: self.cmd.mockbuild(mockargs, self.args.root, hashtype=self.args.hash) - except Exception, e: + except Exception as e: self.log.error('Could not run mockbuild: %s' % e) sys.exit(1)
def mock_config(self): try: print(self.cmd.mock_config(self.args.target, self.args.arch)) - except Exception, e: + except Exception as e: self.log.error('Could not generate the mock config: %s' % e) sys.exit(1)
@@ -1331,7 +1331,7 @@ class TaskWatcher(object): error = None try: result = self.session.getTaskResult(self.id) - except (xmlrpclib.Fault,koji.GenericError),e: + except (xmlrpclib.Fault, koji.GenericError) as e: error = e if error is None: # print "%s: complete" % self.str()
From: Mathieu Bridon bochecha@daitauha.fr
Nothing is using this method.
I checked in the following pyrpkg-based tools:
- rpkg - fedpkg - rhpkg - nbpkg - centpkg - rdopkg --- src/pyrpkg/__init__.py | 18 ------------------ 1 file changed, 18 deletions(-)
diff --git a/src/pyrpkg/__init__.py b/src/pyrpkg/__init__.py index 6538b76..910bb2e 100644 --- a/src/pyrpkg/__init__.py +++ b/src/pyrpkg/__init__.py @@ -1706,24 +1706,6 @@ class Commands(object): raise rpkgError("Error checking for %s at: %s" % (filename, self.lookaside_cgi))
- def upload_file(self, pkg_name, filepath, md5sum): - """ Upload a file to the lookaside cache. """ - - # Setup the POST data for lookaside CGI request. The use of - # 'file' here appears to trigger the actual upload: - post_data = [('name', pkg_name), - ('md5sum', md5sum), - ('file', (pycurl.FORM_FILE, filepath))] - - curl = self._create_curl() - curl.setopt(pycurl.HTTPPOST, post_data) - - try: - curl.perform() - except: - raise rpkgError('Lookaside failure.') - curl.close() - def build(self, skip_tag=False, scratch=False, background=False, url=None, chain=None, arches=None, sets=False, nvr_check=True): """Initiate a build of the module. Available options are:
From: Mathieu Bridon bochecha@daitauha.fr
This paves the way for changing the hash type in the future. --- src/pyrpkg/__init__.py | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/src/pyrpkg/__init__.py b/src/pyrpkg/__init__.py index 910bb2e..b32effc 100644 --- a/src/pyrpkg/__init__.py +++ b/src/pyrpkg/__init__.py @@ -913,7 +913,9 @@ class Commands(object):
cmd = ['curl', '--fail', '-o', '/dev/null', '--show-error', '--progress-bar', '-F', 'name=%s' % self.module_name, - '-F', 'md5sum=%s' % file_hash, '-F', 'file=@%s' % file] + '-F', '%ssum=%s' % (self.lookasidehash, file_hash), + '-F', 'file=@%s' % file] + if self.quiet: cmd.append('-s') cmd.append(self.lookaside_cgi) @@ -1662,7 +1664,7 @@ class Commands(object): raise rpkgError('Could not check out %s' % branch) return
- def file_exists(self, pkg_name, filename, md5sum): + def file_exists(self, pkg_name, filename, checksum): """ Return True if the given file exists in the lookaside cache, False if not. @@ -1679,7 +1681,7 @@ class Commands(object): # 'filename' here appears to be what differentiates this # request from an actual file upload. post_data = [('name', pkg_name), - ('md5sum', md5sum), + ('%ssum' % self.lookasidehash, checksum), ('filename', filename)]
curl = self._create_curl()
From: Mathieu Bridon bochecha@daitauha.fr
This will make it easier to reuse them. It also makes the main module a bit shorter, which can't be a bad thing.
And while I'm at it, I'm documenting all those error classes. --- src/pyrpkg/__init__.py | 12 ++---------- src/pyrpkg/errors.py | 34 ++++++++++++++++++++++++++++++++++ src/pyrpkg/sources.py | 14 ++------------ 3 files changed, 38 insertions(+), 22 deletions(-) create mode 100644 src/pyrpkg/errors.py
diff --git a/src/pyrpkg/__init__.py b/src/pyrpkg/__init__.py index b32effc..c6a13ca 100644 --- a/src/pyrpkg/__init__.py +++ b/src/pyrpkg/__init__.py @@ -38,16 +38,8 @@ try: except ImportError: pass
-from pyrpkg.sources import HashtypeMixingError, SourcesFile - - -# Define our own error class -class rpkgError(Exception): - faultCode = 1000 - -class rpkgAuthError(rpkgError): - """Raised when there is an error in authentication""" - faultCode = 1002 +from pyrpkg.errors import HashtypeMixingError, rpkgError, rpkgAuthError +from pyrpkg.sources import SourcesFile
# Setup our logger diff --git a/src/pyrpkg/errors.py b/src/pyrpkg/errors.py new file mode 100644 index 0000000..ec67bdb --- /dev/null +++ b/src/pyrpkg/errors.py @@ -0,0 +1,34 @@ +# Copyright (c) 2015 - Red Hat Inc. +# +# This program is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by the +# Free Software Foundation; either version 2 of the License, or (at your +# option) any later version. See http://www.gnu.org/copyleft/gpl.html for +# the full text of the license. + + +"""Custom error classes""" + + +class rpkgError(Exception): + """Our base error class""" + faultCode = 1000 + + +class rpkgAuthError(rpkgError): + """Raised in case of authentication errors""" + faultCode = 1002 + + +class HashtypeMixingError(rpkgError): + """Raised when we try to mix hash types in a sources file""" + def __init__(self, existing_hashtype, new_hashtype): + super(HashtypeMixingError, self).__init__() + + self.existing_hashtype = existing_hashtype + self.new_hashtype = new_hashtype + + +class MalformedLineError(rpkgError): + """Raised when parsing a sources file with malformed lines""" + pass diff --git a/src/pyrpkg/sources.py b/src/pyrpkg/sources.py index a3ff1b1..1486c53 100644 --- a/src/pyrpkg/sources.py +++ b/src/pyrpkg/sources.py @@ -19,23 +19,13 @@ entries, and write these entries to the file in the proper format. import os import re
+from .errors import HashtypeMixingError, MalformedLineError +
LINE_PATTERN = re.compile( r'^(?P<hashtype>[^ ]+?) ((?P<file>[^ )]+?)) = (?P<hash>[^ ]+?)$')
-class HashtypeMixingError(Exception): - def __init__(self, existing_hashtype, new_hashtype): - super(HashtypeMixingError, self).__init__() - - self.existing_hashtype = existing_hashtype - self.new_hashtype = new_hashtype - - -class MalformedLineError(Exception): - pass - - class SourcesFile(object): def __init__(self, sourcesfile, entry_type, replace=False): self.sourcesfile = sourcesfile
From: Mathieu Bridon bochecha@daitauha.fr
Using __name__ here means that submodule will all be able to use the same logging configuration. --- src/pyrpkg/__init__.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/pyrpkg/__init__.py b/src/pyrpkg/__init__.py index c6a13ca..9e706b4 100644 --- a/src/pyrpkg/__init__.py +++ b/src/pyrpkg/__init__.py @@ -52,7 +52,7 @@ class NullHandler(logging.Handler): h = NullHandler() # This is our log object, clients of this library can use this object to # define their own logging needs -log = logging.getLogger("rpkg") +log = logging.getLogger(__name__) # Add the null handler log.addHandler(h)
From: Mathieu Bridon bochecha@daitauha.fr
This is going to contain random bits and pieces of utilities.
Currently, it only contains a cached_property class, which is going to be useful to create properties which are only computed the first time they are called, but have their return value cached for future calls. --- src/pyrpkg/utils.py | 43 ++++++++++++++++++ test/test_utils.py | 123 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 166 insertions(+) create mode 100644 src/pyrpkg/utils.py create mode 100644 test/test_utils.py
diff --git a/src/pyrpkg/utils.py b/src/pyrpkg/utils.py new file mode 100644 index 0000000..2d16ce0 --- /dev/null +++ b/src/pyrpkg/utils.py @@ -0,0 +1,43 @@ +# Copyright (c) 2015 - Red Hat Inc. +# +# This program is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by the +# Free Software Foundation; either version 2 of the License, or (at your +# option) any later version. See http://www.gnu.org/copyleft/gpl.html for +# the full text of the license. + + +"""Miscellaneous utilities + +This module contains a bunch of utilities used elsewhere in pyrpkg. +""" + + +class cached_property(property): + """A property caching its return value + + This is pretty much the same as a normal Python property, except that the + decorated function is called only once. Its return value is then saved, + subsequent calls will return it without executing the function any more. + + Example: + >>> class Foo(object): + ... @cached_property + ... def bar(self): + ... print("Executing Foo.bar...") + ... return 42 + ... + >>> f = Foo() + >>> f.bar + Executing Foo.bar... + 42 + >>> f.bar + 42 + """ + def __get__(self, inst, type=None): + try: + return getattr(inst, '_%s' % self.fget.__name__) + except AttributeError: + v = super(cached_property, self).__get__(inst, type) + setattr(inst, '_%s' % self.fget.__name__, v) + return v diff --git a/test/test_utils.py b/test/test_utils.py new file mode 100644 index 0000000..047dde5 --- /dev/null +++ b/test/test_utils.py @@ -0,0 +1,123 @@ +import os +import sys +import unittest + +old_path = list(sys.path) +src_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), '../src') +sys.path.insert(0, src_path) +from pyrpkg.utils import cached_property +sys.path = old_path + + +class CachedPropertyTestCase(unittest.TestCase): + def test_computed_only_once(self): + class Foo(object): + @cached_property + def foo(self): + runs.append("run once") + return 42 + + runs = [] + + f = Foo() + self.assertEqual(len(runs), 0) + self.assertEqual(f.foo, 42) + self.assertEqual(len(runs), 1) + self.assertEqual(f.foo, 42) + self.assertEqual(len(runs), 1) + + def test_not_shared_between_properties(self): + class Foo(object): + @cached_property + def foo(self): + foo_runs.append("run once") + return 42 + + @cached_property + def bar(self): + bar_runs.append("run once") + return 43 + + foo_runs = [] + bar_runs = [] + + f = Foo() + self.assertEqual(len(foo_runs), 0) + self.assertEqual(f.foo, 42) + self.assertEqual(len(foo_runs), 1) + self.assertEqual(f.foo, 42) + self.assertEqual(len(foo_runs), 1) + + self.assertEqual(len(bar_runs), 0) + self.assertEqual(f.bar, 43) + self.assertEqual(len(bar_runs), 1) + self.assertEqual(f.bar, 43) + self.assertEqual(len(bar_runs), 1) + + def test_not_shared_between_instances(self): + class Foo(object): + @cached_property + def foo(self): + foo_runs.append("run once") + return 42 + + class Bar(object): + @cached_property + def foo(self): + bar_runs.append("run once") + return 43 + + foo_runs = [] + bar_runs = [] + + f = Foo() + self.assertEqual(len(foo_runs), 0) + self.assertEqual(f.foo, 42) + self.assertEqual(len(foo_runs), 1) + self.assertEqual(f.foo, 42) + self.assertEqual(len(foo_runs), 1) + + b = Bar() + self.assertEqual(len(bar_runs), 0) + self.assertEqual(b.foo, 43) + self.assertEqual(len(bar_runs), 1) + self.assertEqual(b.foo, 43) + self.assertEqual(len(bar_runs), 1) + + def test_not_shared_when_inheriting(self): + class Foo(object): + @cached_property + def foo(self): + foo_runs.append("run once") + return 42 + + class Bar(Foo): + @cached_property + def foo(self): + bar_runs.append("run once") + return 43 + + foo_runs = [] + bar_runs = [] + + b = Bar() + self.assertEqual(len(bar_runs), 0) + self.assertEqual(b.foo, 43) + self.assertEqual(len(bar_runs), 1) + self.assertEqual(b.foo, 43) + self.assertEqual(len(bar_runs), 1) + + f = Foo() + self.assertEqual(len(foo_runs), 0) + self.assertEqual(f.foo, 42) + self.assertEqual(len(foo_runs), 1) + self.assertEqual(f.foo, 42) + self.assertEqual(len(foo_runs), 1) + + bar_runs = [] + b = Bar() + self.assertEqual(len(bar_runs), 0) + self.assertEqual(b.foo, 43) + self.assertEqual(len(bar_runs), 1) + self.assertEqual(b.foo, 43) + self.assertEqual(len(bar_runs), 1)
On Wed, May 06, 2015 at 01:53:02PM +0200, Mathieu Bridon wrote:
From: Mathieu Bridon bochecha@daitauha.fr
This is going to contain random bits and pieces of utilities.
Currently, it only contains a cached_property class, which is going to be useful to create properties which are only computed the first time they are called, but have their return value cached for future calls.
src/pyrpkg/utils.py | 43 ++++++++++++++++++ test/test_utils.py | 123 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 166 insertions(+) create mode 100644 src/pyrpkg/utils.py create mode 100644 test/test_utils.py
diff --git a/src/pyrpkg/utils.py b/src/pyrpkg/utils.py new file mode 100644 index 0000000..2d16ce0 --- /dev/null +++ b/src/pyrpkg/utils.py @@ -0,0 +1,43 @@ +# Copyright (c) 2015 - Red Hat Inc. +# +# This program is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by the +# Free Software Foundation; either version 2 of the License, or (at your +# option) any later version. See http://www.gnu.org/copyleft/gpl.html for +# the full text of the license.
+"""Miscellaneous utilities
+This module contains a bunch of utilities used elsewhere in pyrpkg. +"""
+class cached_property(property):
- """A property caching its return value
- This is pretty much the same as a normal Python property, except that the
- decorated function is called only once. Its return value is then saved,
- subsequent calls will return it without executing the function any more.
- Example:
>>> class Foo(object):
... @cached_property
... def bar(self):
... print("Executing Foo.bar...")
... return 42
...
>>> f = Foo()
>>> f.bar
Executing Foo.bar...
42
>>> f.bar
42
- """
- def __get__(self, inst, type=None):
try:
return getattr(inst, '_%s' % self.fget.__name__)
except AttributeError:
v = super(cached_property, self).__get__(inst, type)
setattr(inst, '_%s' % self.fget.__name__, v)
return v
diff --git a/test/test_utils.py b/test/test_utils.py new file mode 100644 index 0000000..047dde5 --- /dev/null +++ b/test/test_utils.py @@ -0,0 +1,123 @@ +import os +import sys +import unittest
+old_path = list(sys.path) +src_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), '../src') +sys.path.insert(0, src_path) +from pyrpkg.utils import cached_property +sys.path = old_path
+class CachedPropertyTestCase(unittest.TestCase):
- def test_computed_only_once(self):
class Foo(object):
@cached_property
def foo(self):
runs.append("run once")
return 42
runs = []
f = Foo()
self.assertEqual(len(runs), 0)
self.assertEqual(f.foo, 42)
self.assertEqual(len(runs), 1)
self.assertEqual(f.foo, 42)
self.assertEqual(len(runs), 1)
- def test_not_shared_between_properties(self):
class Foo(object):
@cached_property
def foo(self):
foo_runs.append("run once")
return 42
@cached_property
def bar(self):
bar_runs.append("run once")
return 43
foo_runs = []
bar_runs = []
f = Foo()
self.assertEqual(len(foo_runs), 0)
self.assertEqual(f.foo, 42)
self.assertEqual(len(foo_runs), 1)
self.assertEqual(f.foo, 42)
self.assertEqual(len(foo_runs), 1)
self.assertEqual(len(bar_runs), 0)
self.assertEqual(f.bar, 43)
self.assertEqual(len(bar_runs), 1)
self.assertEqual(f.bar, 43)
self.assertEqual(len(bar_runs), 1)
- def test_not_shared_between_instances(self):
class Foo(object):
@cached_property
def foo(self):
foo_runs.append("run once")
return 42
class Bar(object):
@cached_property
def foo(self):
bar_runs.append("run once")
return 43
foo_runs = []
bar_runs = []
f = Foo()
self.assertEqual(len(foo_runs), 0)
self.assertEqual(f.foo, 42)
self.assertEqual(len(foo_runs), 1)
self.assertEqual(f.foo, 42)
self.assertEqual(len(foo_runs), 1)
b = Bar()
self.assertEqual(len(bar_runs), 0)
self.assertEqual(b.foo, 43)
self.assertEqual(len(bar_runs), 1)
self.assertEqual(b.foo, 43)
self.assertEqual(len(bar_runs), 1)
- def test_not_shared_when_inheriting(self):
class Foo(object):
@cached_property
def foo(self):
foo_runs.append("run once")
return 42
class Bar(Foo):
@cached_property
def foo(self):
bar_runs.append("run once")
return 43
foo_runs = []
bar_runs = []
b = Bar()
self.assertEqual(len(bar_runs), 0)
self.assertEqual(b.foo, 43)
self.assertEqual(len(bar_runs), 1)
self.assertEqual(b.foo, 43)
self.assertEqual(len(bar_runs), 1)
f = Foo()
self.assertEqual(len(foo_runs), 0)
self.assertEqual(f.foo, 42)
self.assertEqual(len(foo_runs), 1)
self.assertEqual(f.foo, 42)
self.assertEqual(len(foo_runs), 1)
bar_runs = []
b = Bar()
self.assertEqual(len(bar_runs), 0)
self.assertEqual(b.foo, 43)
self.assertEqual(len(bar_runs), 1)
self.assertEqual(b.foo, 43)
self.assertEqual(len(bar_runs), 1)
-- 2.1.0
Nice tests for this!
From: Mathieu Bridon bochecha@daitauha.fr
At the moment this doesn't do much: it only stores the 3 lookaside-related parameters we have (a download url, an upload url and a hash type), but eventually it will become a full-fledged lookaside cache client.
Using a property for it makes it trivial for downstreams to override it if they need to use their own implementation of a lookaside cache client.
And using a cached property means we won't reinitialize the CGILookasideCache object every time we access the Commands.lookasidecache attribute. --- src/pyrpkg/__init__.py | 16 ++++++++++++++++ src/pyrpkg/lookaside.py | 30 ++++++++++++++++++++++++++++++ 2 files changed, 46 insertions(+) create mode 100644 src/pyrpkg/lookaside.py
diff --git a/src/pyrpkg/__init__.py b/src/pyrpkg/__init__.py index 9e706b4..b06c68b 100644 --- a/src/pyrpkg/__init__.py +++ b/src/pyrpkg/__init__.py @@ -39,7 +39,9 @@ except ImportError: pass
from pyrpkg.errors import HashtypeMixingError, rpkgError, rpkgAuthError +from pyrpkg.lookaside import CGILookasideCache from pyrpkg.sources import SourcesFile +from pyrpkg.utils import cached_property
# Setup our logger @@ -166,6 +168,20 @@ class Commands(object): # that we can do clone actions without knowing things like the spec # file or rpm data.
+ @cached_property + def lookasidecache(self): + """A helper to interact with the lookaside cache + + This is a pyrpkg.lookaside.CGILookasideCache instance, providing all + the needed stuff to communicate with a Fedora-style lookaside cache. + + Downstream users of the pyrpkg API may override this property with + their own, returning their own implementation of a lookaside cache + helper object. + """ + return CGILookasideCache( + self.lookasidehash, self.lookaside, self.lookaside_cgi) + @property def path(self): return self._path diff --git a/src/pyrpkg/lookaside.py b/src/pyrpkg/lookaside.py new file mode 100644 index 0000000..f7d191e --- /dev/null +++ b/src/pyrpkg/lookaside.py @@ -0,0 +1,30 @@ +# Copyright (c) 2015 - Red Hat Inc. +# +# This program is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by the +# Free Software Foundation; either version 2 of the License, or (at your +# option) any later version. See http://www.gnu.org/copyleft/gpl.html for +# the full text of the license. + + +"""Interact with a lookaside cache + +This module contains everything needed to upload and download source files the +way it is done by Fedora, RHEL, and other distributions maintainers. +""" + + +class CGILookasideCache(object): + """A class to interact with a CGI-based lookaside cache""" + def __init__(self, hashtype, download_url, upload_url): + """Constructor + + Args: + hashtype (str): The hash algorithm to use for uploads. (e.g 'md5') + download_url (str): The URL used to download source files. + upload_url (str): The URL of the CGI script called when uploading + source files. + """ + self.hashtype = hashtype + self.download_url = download_url + self.upload_url = upload_url
From: Mathieu Bridon bochecha@daitauha.fr
This helps deprecate parts of the API properly, emitting a warning for API users.
We're going to start using it soon. --- src/pyrpkg/utils.py | 20 ++++++++++++++++++++ test/test_utils.py | 38 +++++++++++++++++++++++++++++++++++++- 2 files changed, 57 insertions(+), 1 deletion(-)
diff --git a/src/pyrpkg/utils.py b/src/pyrpkg/utils.py index 2d16ce0..1d95218 100644 --- a/src/pyrpkg/utils.py +++ b/src/pyrpkg/utils.py @@ -13,6 +13,10 @@ This module contains a bunch of utilities used elsewhere in pyrpkg. """
+import warnings +warnings.simplefilter('always', DeprecationWarning) + + class cached_property(property): """A property caching its return value
@@ -41,3 +45,19 @@ class cached_property(property): v = super(cached_property, self).__get__(inst, type) setattr(inst, '_%s' % self.fget.__name__, v) return v + + +def warn_deprecated(clsname, oldname, newname): + """Emit a deprecation warning + + Args: + clsname (str): The name of the class which has its attribute + deprecated. + oldname (str): The name of the deprecated attribute. + newname (str): The name of the new attribute, which should be used + instead. + """ + warnings.warn( + "%s.%s is deprecated and will be removed eventually.\n Please " + "use %s.%s instead." % (clsname, oldname, clsname, newname), + DeprecationWarning, stacklevel=3) diff --git a/test/test_utils.py b/test/test_utils.py index 047dde5..f9aa14a 100644 --- a/test/test_utils.py +++ b/test/test_utils.py @@ -1,11 +1,14 @@ import os import sys import unittest +import warnings + +import mock
old_path = list(sys.path) src_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), '../src') sys.path.insert(0, src_path) -from pyrpkg.utils import cached_property +from pyrpkg.utils import cached_property, warn_deprecated sys.path = old_path
@@ -121,3 +124,36 @@ class CachedPropertyTestCase(unittest.TestCase): self.assertEqual(len(bar_runs), 1) self.assertEqual(b.foo, 43) self.assertEqual(len(bar_runs), 1) + + +class DeprecationUtilsTestCase(unittest.TestCase): + def setUp(self): + warnings.simplefilter('always', DeprecationWarning) + + @mock.patch('sys.stderr') + def test_warn_deprecated(self, mock_stderr): + class Foo(object): + def old_method(self): + warn_deprecated(self.__class__.__name__, 'old_method', + 'new_method') + return self.new_method() + + def new_method(self): + return "Yay!" + + def mock_write(msg): + written_lines.append(msg) + + written_lines = [] + mock_stderr.write.side_effect = mock_write + + foo = Foo() + self.assertEqual(foo.old_method(), foo.new_method()) + self.assertEqual(len(written_lines), 1) + self.assertTrue('DeprecationWarning' in written_lines[0]) + self.assertTrue('Foo.old_method' in written_lines[0]) + self.assertTrue('Foo.new_method' in written_lines[0]) + + warnings.simplefilter('error', DeprecationWarning) + self.assertRaises(DeprecationWarning, foo.old_method) + self.assertEqual(len(written_lines), 1)
From: Mathieu Bridon bochecha@daitauha.fr
The only time we need to compute the hash of a source file is when we interact with the lookaside cache, so it makes sense to have it there, and remove some clutter from the main module.
However, the Commands._hash_file might be used by downstream, so let's keep it for a while, make it use the new lookaside module, and emit a deprecation warning. --- src/pyrpkg/__init__.py | 27 ++++++------------------ src/pyrpkg/errors.py | 5 +++++ src/pyrpkg/lookaside.py | 34 ++++++++++++++++++++++++++++++ test/test_lookaside.py | 55 +++++++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 100 insertions(+), 21 deletions(-) create mode 100644 test/test_lookaside.py
diff --git a/src/pyrpkg/__init__.py b/src/pyrpkg/__init__.py index b06c68b..e369db5 100644 --- a/src/pyrpkg/__init__.py +++ b/src/pyrpkg/__init__.py @@ -41,7 +41,7 @@ except ImportError: from pyrpkg.errors import HashtypeMixingError, rpkgError, rpkgAuthError from pyrpkg.lookaside import CGILookasideCache from pyrpkg.sources import SourcesFile -from pyrpkg.utils import cached_property +from pyrpkg.utils import cached_property, warn_deprecated
# Setup our logger @@ -773,24 +773,9 @@ class Commands(object): return False
def _hash_file(self, file, hashtype): - """Return the hash of a file given a hash type""" - - try: - sum = hashlib.new(hashtype) - except ValueError: - raise rpkgError('Invalid hash type: %s' % hashtype) - - input = open(file, 'rb') - # Loop through the file reading chunks at a time as to not - # put the entire file in memory. That would suck for DVDs - while True: - chunk = input.read(8192) # magic number! Taking suggestions - if not chunk: - # we're done with the file - break - sum.update(chunk) - input.close() - return sum.hexdigest() + warn_deprecated(self.__class__.__name__, '_hash_file', + 'lookasidecache.hash_file') + return self.lookasidecache.hash_file(file, hashtype=hashtype)
def _run_command(self, cmd, shell=False, env=None, pipe=[], cwd=None): """Run the given command. @@ -901,7 +886,7 @@ class Commands(object): """
# get the hash - sum = self._hash_file(file, hashtype) + sum = self.lookasidecache.hash_file(file, hashtype=hashtype) # now do the comparison if sum == hash: return True @@ -2259,7 +2244,7 @@ class Commands(object): uploaded = [] for f in files: # TODO: Skip empty file needed? - file_hash = self._hash_file(f, self.lookasidehash) + file_hash = self.lookasidecache.hash_file(f) self.log.info("Uploading: %s %s" % (file_hash, f)) file_basename = os.path.basename(f)
diff --git a/src/pyrpkg/errors.py b/src/pyrpkg/errors.py index ec67bdb..e9d481b 100644 --- a/src/pyrpkg/errors.py +++ b/src/pyrpkg/errors.py @@ -32,3 +32,8 @@ class HashtypeMixingError(rpkgError): class MalformedLineError(rpkgError): """Raised when parsing a sources file with malformed lines""" pass + + +class InvalidHashType(rpkgError): + """Raised when we don't know the requested hash algorithm""" + pass diff --git a/src/pyrpkg/lookaside.py b/src/pyrpkg/lookaside.py index f7d191e..0a7e873 100644 --- a/src/pyrpkg/lookaside.py +++ b/src/pyrpkg/lookaside.py @@ -14,6 +14,11 @@ way it is done by Fedora, RHEL, and other distributions maintainers. """
+import hashlib + +from .errors import InvalidHashType + + class CGILookasideCache(object): """A class to interact with a CGI-based lookaside cache""" def __init__(self, hashtype, download_url, upload_url): @@ -28,3 +33,32 @@ class CGILookasideCache(object): self.hashtype = hashtype self.download_url = download_url self.upload_url = upload_url + + def hash_file(self, filename, hashtype=None): + """Compute the hash of a file + + Args: + filename (str): The full path to the file. It is assumed to exist. + hashtype (str, optional): The hash algorithm to use. (e.g 'md5') + This defaults to the hashtype passed to the constructor. + + Returns: + The hash digest. + """ + if hashtype is None: + hashtype = self.hashtype + + try: + sum = hashlib.new(hashtype) + + except ValueError: + raise InvalidHashType(hashtype) + + with open(filename, 'rb') as f: + chunk = f.read(8192) + + while chunk: + sum.update(chunk) + chunk = f.read(8192) + + return sum.hexdigest() diff --git a/test/test_lookaside.py b/test/test_lookaside.py new file mode 100644 index 0000000..aea886b --- /dev/null +++ b/test/test_lookaside.py @@ -0,0 +1,55 @@ +# Copyright (c) 2015 - Red Hat Inc. +# +# This program is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by the +# Free Software Foundation; either version 2 of the License, or (at your +# option) any later version. See http://www.gnu.org/copyleft/gpl.html for +# the full text of the license. + + +import os +import shutil +import sys +import tempfile +import unittest + +old_path = list(sys.path) +src_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), '../src') +sys.path.insert(0, src_path) +from pyrpkg.lookaside import CGILookasideCache +from pyrpkg.errors import InvalidHashType +sys.path = old_path + + +class CGILookasideCacheTestCase(unittest.TestCase): + def setUp(self): + self.workdir = tempfile.mkdtemp(prefix='rpkg-tests.') + self.filename = os.path.join(self.workdir, self._testMethodName) + + def tearDown(self): + shutil.rmtree(self.workdir) + + def test_hash_file(self): + lc = CGILookasideCache('sha512', '_', '_') + + with open(self.filename, 'w') as f: + f.write('something') + + result = lc.hash_file(self.filename, 'md5') + self.assertEqual(result, '437b930db84b8079c2dd804a71936b5f') + + result = lc.hash_file(self.filename) + self.assertEqual(result, '983d43ddff6da90f6a5d3b6172446a1ffe228b803fe64fdd5dcfab5646078a896851fe82f623c9d6e5654b3d2f363a04ec17cfb62b607437a9c7c132d511e522') # nopep8 + + def test_hash_file_invalid_hash_type(self): + lc = CGILookasideCache('sha512', '_', '_') + self.assertRaises(InvalidHashType, lc.hash_file, '_', 'sha42') + + def test_hash_file_empty(self): + lc = CGILookasideCache('sha512', '_', '_') + + with open(self.filename, 'w') as f: + f.write('') + + result = lc.hash_file(self.filename, 'md5') + self.assertEqual(result, 'd41d8cd98f00b204e9800998ecf8427e')
From: Mathieu Bridon bochecha@daitauha.fr
The only time we need to verify whether a source file is valid is when we interact with the lookaside cache, so it makes sense to have it there, and remove some clutter from the main module.
However, the Commands._verify_file might be used by downstreams, so let's keep it for a while, make it use the new lookaside module, and emit a deprecation warning. --- src/pyrpkg/__init__.py | 20 ++++++-------------- src/pyrpkg/lookaside.py | 15 +++++++++++++++ test/test_lookaside.py | 11 +++++++++++ 3 files changed, 32 insertions(+), 14 deletions(-)
diff --git a/src/pyrpkg/__init__.py b/src/pyrpkg/__init__.py index e369db5..eff3551 100644 --- a/src/pyrpkg/__init__.py +++ b/src/pyrpkg/__init__.py @@ -879,18 +879,10 @@ class Commands(object): return
def _verify_file(self, file, hash, hashtype): - """Given a file, a hash of that file, and a hashtype, verify. - - Returns True if the file verifies, False otherwise - - """ - - # get the hash - sum = self.lookasidecache.hash_file(file, hashtype=hashtype) - # now do the comparison - if sum == hash: - return True - return False + warn_deprecated(self.__class__.__name__, '_verify_file', + 'lookasidecache.file_is_valid') + return self.lookasidecache.file_is_valid(file, hash, + hashtype=hashtype)
def _newer(self, file1, file2): """Compare the last modification time of the given files @@ -1590,7 +1582,7 @@ class Commands(object): # See if we already have a valid copy downloaded outfile = os.path.join(outdir, entry.file) if os.path.exists(outfile): - if self._verify_file(outfile, entry.hash, entry.hashtype): + if self.lookasidecache.file_is_valid(outfile, entry.hash, hashtype=entry.hashtype): continue self.log.info("Downloading %s" % (entry.file)) urled_file = entry.file.replace(' ', '%20') @@ -1604,7 +1596,7 @@ class Commands(object): command.append('-s') command.append(url) self._run_command(command) - if not self._verify_file(outfile, entry.hash, entry.hashtype): + if not self.lookasidecache.file_is_valid(outfile, entry.hash, hashtype=entry.hashtype): raise rpkgError('%s failed checksum' % entry.file) return
diff --git a/src/pyrpkg/lookaside.py b/src/pyrpkg/lookaside.py index 0a7e873..e3c1bac 100644 --- a/src/pyrpkg/lookaside.py +++ b/src/pyrpkg/lookaside.py @@ -62,3 +62,18 @@ class CGILookasideCache(object): chunk = f.read(8192)
return sum.hexdigest() + + def file_is_valid(self, filename, hash, hashtype=None): + """Ensure the file is correct + + Args: + filename (str): The full path to the file. It is assumed to exist. + hash (str): The known good hash of the file. + hashtype (str, optional): The hash algorithm to use. (e.g 'md5') + This defaults to the hashtype passed to the constructor. + + Returns: + True if the file is valid, False otherwise. + """ + sum = self.hash_file(filename, hashtype) + return sum == hash diff --git a/test/test_lookaside.py b/test/test_lookaside.py index aea886b..4a4f067 100644 --- a/test/test_lookaside.py +++ b/test/test_lookaside.py @@ -53,3 +53,14 @@ class CGILookasideCacheTestCase(unittest.TestCase):
result = lc.hash_file(self.filename, 'md5') self.assertEqual(result, 'd41d8cd98f00b204e9800998ecf8427e') + + def test_file_is_valid(self): + lc = CGILookasideCache('md5', '_', '_') + + with open(self.filename, 'w') as f: + f.write('something') + + self.assertTrue(lc.file_is_valid(self.filename, + '437b930db84b8079c2dd804a71936b5f')) + self.assertFalse(lc.file_is_valid(self.filename, 'not the right hash', + hashtype='sha512'))
From: Mathieu Bridon bochecha@daitauha.fr
This is quite a large chunk of code that is moving from pyrpkg.Commands to the new pyrpkg.lookaside.CGILookasideCache.
In addition, we now don't fork a call to the curl command any more, but instead use pycurl. This allows the download method to be entirely tested.
Downstreams can now also easily use their own URL format, which should make the CentOS folks happier, as well as make it easier when we want to add the hashtype to the URL in Fedora. (as part of the move away from md5 hashes)
As part of the move to pycurl, we are losing the download progress indication, but it will make a come-back in a future commit. --- src/pyrpkg/__init__.py | 23 ++------- src/pyrpkg/errors.py | 5 ++ src/pyrpkg/lookaside.py | 63 +++++++++++++++++++++++- test/test_lookaside.py | 124 +++++++++++++++++++++++++++++++++++++++++++++++- 4 files changed, 193 insertions(+), 22 deletions(-)
diff --git a/src/pyrpkg/__init__.py b/src/pyrpkg/__init__.py index eff3551..ce262e6 100644 --- a/src/pyrpkg/__init__.py +++ b/src/pyrpkg/__init__.py @@ -20,7 +20,6 @@ if sys.version_info[0:2] >= (2, 5): else: # We need a subprocess that has check_call from kitchen.pycompat27 import subprocess -import hashlib import koji import rpm import logging @@ -1579,26 +1578,10 @@ class Commands(object): sourcesf = SourcesFile(self.sources_filename, self.source_entry_type)
for entry in sourcesf.entries: - # See if we already have a valid copy downloaded outfile = os.path.join(outdir, entry.file) - if os.path.exists(outfile): - if self.lookasidecache.file_is_valid(outfile, entry.hash, hashtype=entry.hashtype): - continue - self.log.info("Downloading %s" % (entry.file)) - urled_file = entry.file.replace(' ', '%20') - url = '%s/%s/%s/%s/%s' % (self.lookaside, self.module_name, - urled_file, entry.hash, urled_file) - # These options came from Makefile.common. - # Probably need to support wget as well - command = ['curl', '-H', 'Pragma:', '-o', outfile, '-R', '-S', - '--fail'] - if self.quiet: - command.append('-s') - command.append(url) - self._run_command(command) - if not self.lookasidecache.file_is_valid(outfile, entry.hash, hashtype=entry.hashtype): - raise rpkgError('%s failed checksum' % entry.file) - return + self.lookasidecache.download( + self.module_name, entry.file, entry.hash, outfile, + hashtype=entry.hashtype)
def switch_branch(self, branch, fetch=True): """Switch the working branch diff --git a/src/pyrpkg/errors.py b/src/pyrpkg/errors.py index e9d481b..9547702 100644 --- a/src/pyrpkg/errors.py +++ b/src/pyrpkg/errors.py @@ -37,3 +37,8 @@ class MalformedLineError(rpkgError): class InvalidHashType(rpkgError): """Raised when we don't know the requested hash algorithm""" pass + + +class DownloadError(rpkgError): + """Raised when something went wrong during a download""" + pass diff --git a/src/pyrpkg/lookaside.py b/src/pyrpkg/lookaside.py index e3c1bac..1108fcf 100644 --- a/src/pyrpkg/lookaside.py +++ b/src/pyrpkg/lookaside.py @@ -15,8 +15,13 @@ way it is done by Fedora, RHEL, and other distributions maintainers.
import hashlib +import logging +import os +import sys
-from .errors import InvalidHashType +import pycurl + +from .errors import DownloadError, InvalidHashType
class CGILookasideCache(object): @@ -34,6 +39,10 @@ class CGILookasideCache(object): self.download_url = download_url self.upload_url = upload_url
+ self.log = logging.getLogger(__name__) + + self.download_path = '%(name)s/%(filename)s/%(hash)s/%(filename)s' + def hash_file(self, filename, hashtype=None): """Compute the hash of a file
@@ -77,3 +86,55 @@ class CGILookasideCache(object): """ sum = self.hash_file(filename, hashtype) return sum == hash + + def download(self, name, filename, hash, outfile, hashtype=None): + """Download a source file + + Args: + name (str): The name of the module. (usually the name of the SRPM) + filename (str): The name of the file to download. + hash (str): The known good hash of the file. + outfile (str): The full path where to save the downloaded file. + hashtype (str, optional): The hash algorithm. (e.g 'md5') + This defaults to the hashtype passed to the constructor. + """ + if hashtype is None: + hashtype = self.hashtype + + if os.path.exists(outfile): + if self.file_is_valid(outfile, hash, hashtype=hashtype): + return + + self.log.info("Downloading %s", filename) + urled_file = filename.replace(' ', '%20') + + path_dict = {'name': name, 'filename': urled_file, 'hash': hash} + path = self.download_path % path_dict + url = '%s/%s' % (self.download_url, path) + self.log.debug("Full url: %s" % url) + + with open(outfile, 'wb') as f: + c = pycurl.Curl() + c.setopt(pycurl.URL, url) + c.setopt(pycurl.HTTPHEADER, ['Pragma:']) + c.setopt(pycurl.OPT_FILETIME, True) + c.setopt(pycurl.WRITEDATA, f) + + try: + c.perform() + tstamp = c.getinfo(pycurl.INFO_FILETIME) + status = c.getinfo(pycurl.RESPONSE_CODE) + + except Exception as e: + raise DownloadError(e) + + finally: + c.close() + + if status != 200: + raise DownloadError('Server returned status code %d' % status) + + os.utime(outfile, (tstamp, tstamp)) + + if not self.file_is_valid(outfile, hash, hashtype=hashtype): + raise DownloadError('%s failed checksum' % filename) diff --git a/test/test_lookaside.py b/test/test_lookaside.py index 4a4f067..89a34cd 100644 --- a/test/test_lookaside.py +++ b/test/test_lookaside.py @@ -7,17 +7,21 @@ # the full text of the license.
+import hashlib import os import shutil import sys import tempfile import unittest
+import mock +import pycurl + old_path = list(sys.path) src_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), '../src') sys.path.insert(0, src_path) from pyrpkg.lookaside import CGILookasideCache -from pyrpkg.errors import InvalidHashType +from pyrpkg.errors import DownloadError, InvalidHashType sys.path = old_path
@@ -64,3 +68,121 @@ class CGILookasideCacheTestCase(unittest.TestCase): '437b930db84b8079c2dd804a71936b5f')) self.assertFalse(lc.file_is_valid(self.filename, 'not the right hash', hashtype='sha512')) + + @mock.patch('pyrpkg.lookaside.pycurl.Curl') + def test_download(self, mock_curl): + def mock_getinfo(info): + return 200 if info == pycurl.RESPONSE_CODE else 0 + + def mock_perform(): + with open(self.filename, 'rb') as f: + curlopts[pycurl.WRITEDATA].write(f.read()) + + def mock_setopt(opt, value): + curlopts[opt] = value + + curlopts = {} + curl = mock_curl.return_value + curl.getinfo.side_effect = mock_getinfo + curl.perform.side_effect = mock_perform + curl.setopt.side_effect = mock_setopt + + with open(self.filename, 'wb') as f: + f.write(b'content') + + name = 'pyrpkg' + filename = 'pyrpkg-0.0.tar.xz' + hash = hashlib.sha512(b'content').hexdigest() + outfile = os.path.join(self.workdir, 'pyrpkg-0.0.tar.xz') + full_url = 'http://example.com/%s/%s/%s/%s' % (name, filename, hash, + filename) + + lc = CGILookasideCache('sha512', 'http://example.com', '_') + lc.download(name, filename, hash, outfile, hashtype='sha512') + self.assertEqual(curl.perform.call_count, 1) + self.assertEqual(curlopts[pycurl.URL], full_url) + self.assertEqual(os.path.getmtime(outfile), 0) + + with open(outfile) as f: + self.assertEqual(f.read(), 'content') + + # Try a second time + lc.download(name, filename, hash, outfile) + self.assertEqual(curl.perform.call_count, 1) + + # Try a third time + os.remove(outfile) + lc.download(name, filename, hash, outfile) + self.assertEqual(curl.perform.call_count, 2) + + @mock.patch('pyrpkg.lookaside.pycurl.Curl') + def test_download_corrupted(self, mock_curl): + def mock_getinfo(info): + return 200 if info == pycurl.RESPONSE_CODE else 0 + + def mock_perform(): + with open(self.filename) as f: + curlopts[pycurl.WRITEDATA].write(f.read()) + + def mock_setopt(opt, value): + curlopts[opt] = value + + curlopts = {} + curl = mock_curl.return_value + curl.getinfo.side_effect = mock_getinfo + curl.perform.side_effect = mock_perform + curl.setopt.side_effect = mock_setopt + + with open(self.filename, 'wb') as f: + f.write(b'content') + + hash = "not the right hash" + outfile = os.path.join(self.workdir, 'pyrpkg-0.0.tar.xz') + + lc = CGILookasideCache('sha512', 'http://example.com', '_') + self.assertRaises(DownloadError, lc.download, 'pyrpkg', + 'pyrpkg-0.0.tar.xz', hash, outfile) + + @mock.patch('pyrpkg.lookaside.pycurl.Curl') + def test_download_failed(self, mock_curl): + curl = mock_curl.return_value + curl.perform.side_effect = Exception( + 'Could not resolve host: example.com') + + with open(self.filename, 'wb') as f: + f.write(b'content') + + hash = hashlib.sha512(b'content').hexdigest() + outfile = os.path.join(self.workdir, 'pyrpkg-0.0.tar.xz') + + lc = CGILookasideCache('sha512', 'http://example.com', '_') + self.assertRaises(DownloadError, lc.download, 'pyrpkg', + 'pyrpkg-0.0.tar.xz', hash, outfile) + + @mock.patch('pyrpkg.lookaside.pycurl.Curl') + def test_download_failed_status_code(self, mock_curl): + def mock_getinfo(info): + return 500 if info == pycurl.RESPONSE_CODE else 0 + + def mock_perform(): + with open(self.filename) as f: + curlopts[pycurl.WRITEDATA].write(f.read()) + + def mock_setopt(opt, value): + curlopts[opt] = value + + curlopts = {} + curl = mock_curl.return_value + curl.getinfo.side_effect = mock_getinfo + curl.perform.side_effect = mock_perform + curl.setopt.side_effect = mock_setopt + + with open(self.filename, 'wb') as f: + f.write(b'content') + + hash = hashlib.sha512(b'content').hexdigest() + outfile = os.path.join(self.workdir, 'pyrpkg-0.0.tar.xz') + + lc = CGILookasideCache('sha512', 'http://example.com', '_') + self.assertRaises(DownloadError, lc.download, 'pyrpkg', + 'pyrpkg-0.0.tar.xz', hash, outfile)
From: Mathieu Bridon bochecha@daitauha.fr
Due to the way pycurl handles progress callback, this handles both upload and download progress.
However, we only use it for download at the moment. --- src/pyrpkg/lookaside.py | 24 +++++++++++++++ test/test_lookaside.py | 81 +++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 105 insertions(+)
diff --git a/src/pyrpkg/lookaside.py b/src/pyrpkg/lookaside.py index 1108fcf..8badc73 100644 --- a/src/pyrpkg/lookaside.py +++ b/src/pyrpkg/lookaside.py @@ -43,6 +43,24 @@ class CGILookasideCache(object):
self.download_path = '%(name)s/%(filename)s/%(hash)s/%(filename)s'
+ def print_progress(self, to_download, downloaded, to_upload, uploaded): + if to_download > 0: + done = downloaded / to_download + + elif to_upload > 0: + done = uploaded / to_upload + + else: + return + + done_chars = int(done * 72) + remain_chars = 72 - done_chars + done = int(done * 1000) / 10.0 + + p = "\r%s%s %s%%" % ("#" * done_chars, " " * remain_chars, done) + sys.stdout.write(p) + sys.stdout.flush() + def hash_file(self, filename, hashtype=None): """Compute the hash of a file
@@ -117,6 +135,8 @@ class CGILookasideCache(object): c = pycurl.Curl() c.setopt(pycurl.URL, url) c.setopt(pycurl.HTTPHEADER, ['Pragma:']) + c.setopt(pycurl.NOPROGRESS, False) + c.setopt(pycurl.PROGRESSFUNCTION, self.print_progress) c.setopt(pycurl.OPT_FILETIME, True) c.setopt(pycurl.WRITEDATA, f)
@@ -131,6 +151,10 @@ class CGILookasideCache(object): finally: c.close()
+ # Get back a new line, after displaying the download progress + sys.stdout.write('\n') + sys.stdout.flush() + if status != 200: raise DownloadError('Server returned status code %d' % status)
diff --git a/test/test_lookaside.py b/test/test_lookaside.py index 89a34cd..7fe1743 100644 --- a/test/test_lookaside.py +++ b/test/test_lookaside.py @@ -186,3 +186,84 @@ class CGILookasideCacheTestCase(unittest.TestCase): lc = CGILookasideCache('sha512', 'http://example.com', '_') self.assertRaises(DownloadError, lc.download, 'pyrpkg', 'pyrpkg-0.0.tar.xz', hash, outfile) + + @mock.patch('pyrpkg.lookaside.sys.stdout') + def test_print_download_progress(self, mock_stdout): + def mock_write(msg): + written_lines.append(msg) + + written_lines = [] + expected_lines = [ + '\r################## 25.0%', # nopep8 + '\r#################################### 50.0%', # nopep8 + '\r###################################################### 75.0%', # nopep8 + '\r######################################################################## 100.0%', # nopep8 + ] + + mock_stdout.write.side_effect = mock_write + + lc = CGILookasideCache('_', '_', '_') + lc.print_progress(2000.0, 500.0, 0.0, 0.0) + self.assertEqual(mock_stdout.write.call_count, 1) + self.assertEqual(len(written_lines), 1) + + lc.print_progress(2000.0, 1000.0, 0.0, 0.0) + self.assertEqual(mock_stdout.write.call_count, 2) + self.assertEqual(len(written_lines), 2) + + lc.print_progress(2000.0, 1500.0, 0.0, 0.0) + self.assertEqual(mock_stdout.write.call_count, 3) + self.assertEqual(len(written_lines), 3) + + lc.print_progress(2000.0, 2000.0, 0.0, 0.0) + self.assertEqual(mock_stdout.write.call_count, 4) + self.assertEqual(len(written_lines), 4) + + self.assertEqual(written_lines, expected_lines) + + @mock.patch('pyrpkg.lookaside.sys.stdout') + def test_print_upload_progress(self, mock_stdout): + def mock_write(msg): + written_lines.append(msg) + + written_lines = [] + expected_lines = [ + '\r################## 25.0%', # nopep8 + '\r#################################### 50.0%', # nopep8 + '\r###################################################### 75.0%', # nopep8 + '\r######################################################################## 100.0%', # nopep8 + ] + + mock_stdout.write.side_effect = mock_write + + lc = CGILookasideCache('_', '_', '_') + lc.print_progress(0.0, 0.0, 2000.0, 500.0) + self.assertEqual(mock_stdout.write.call_count, 1) + self.assertEqual(len(written_lines), 1) + + lc.print_progress(0.0, 0.0, 2000.0, 1000.0) + self.assertEqual(mock_stdout.write.call_count, 2) + self.assertEqual(len(written_lines), 2) + + lc.print_progress(0.0, 0.0, 2000.0, 1500.0) + self.assertEqual(mock_stdout.write.call_count, 3) + self.assertEqual(len(written_lines), 3) + + lc.print_progress(0.0, 0.0, 2000.0, 2000.0) + self.assertEqual(mock_stdout.write.call_count, 4) + self.assertEqual(len(written_lines), 4) + + self.assertEqual(written_lines, expected_lines) + + @mock.patch('pyrpkg.lookaside.sys.stdout') + def test_print_no_progress(self, mock_stdout): + def mock_write(msg): + written_lines.append(msg) + + written_lines = [] + + mock_stdout.write.side_effect = mock_write + + lc = CGILookasideCache('_', '_', '_') + lc.print_progress(0.0, 0.0, 0.0, 0.0) + self.assertEqual(len(written_lines), 0)
From: Mathieu Bridon bochecha@daitauha.fr
In Fedora, we eventually want to move away from MD5 hashes, and as part of this effort we would like to have the hashtype as part of the download URL.
This makes it trivial, all we'd have to do is to somehow change the download_path attribute. (either changing it in pyrpkg, or overriding in a subclass in fedpkg) --- src/pyrpkg/lookaside.py | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/pyrpkg/lookaside.py b/src/pyrpkg/lookaside.py index 8badc73..14d0640 100644 --- a/src/pyrpkg/lookaside.py +++ b/src/pyrpkg/lookaside.py @@ -126,7 +126,8 @@ class CGILookasideCache(object): self.log.info("Downloading %s", filename) urled_file = filename.replace(' ', '%20')
- path_dict = {'name': name, 'filename': urled_file, 'hash': hash} + path_dict = {'name': name, 'filename': urled_file, 'hash': hash, + 'hashtype': hashtype} path = self.download_path % path_dict url = '%s/%s' % (self.download_url, path) self.log.debug("Full url: %s" % url)
From: Mathieu Bridon bochecha@daitauha.fr
We're never going to know in advance everything our downstreams are going to need as part of their download URL.
However, we already know that one of them, CentOS, needs the git branch in there.
We could add the git branch as an argument to the download function, and use it for the string interpolation.
However, someone would soon come and need something else.
This commit provides a much more generic way, which could work for all downstreams. All they'd have to do is override pyrpkg.Commands.sources to pass their own keyword arguments to pyrpkg.lookaside.CGILookasideCache.download.
We could even add them to pyrpkg if they make sense to have them shared.
And in fact, since we already know that CentOS wants the git branch, this commit also adds it, which should make them much happier, as that should make it much easier for them to reuse more code from pyrpkg. --- src/pyrpkg/__init__.py | 2 +- src/pyrpkg/lookaside.py | 5 ++++- test/test_lookaside.py | 42 ++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 47 insertions(+), 2 deletions(-)
diff --git a/src/pyrpkg/__init__.py b/src/pyrpkg/__init__.py index ce262e6..c98e3e6 100644 --- a/src/pyrpkg/__init__.py +++ b/src/pyrpkg/__init__.py @@ -1581,7 +1581,7 @@ class Commands(object): outfile = os.path.join(outdir, entry.file) self.lookasidecache.download( self.module_name, entry.file, entry.hash, outfile, - hashtype=entry.hashtype) + hashtype=entry.hashtype, branch=self.branch_merge)
def switch_branch(self, branch, fetch=True): """Switch the working branch diff --git a/src/pyrpkg/lookaside.py b/src/pyrpkg/lookaside.py index 14d0640..a1c17cb 100644 --- a/src/pyrpkg/lookaside.py +++ b/src/pyrpkg/lookaside.py @@ -105,7 +105,7 @@ class CGILookasideCache(object): sum = self.hash_file(filename, hashtype) return sum == hash
- def download(self, name, filename, hash, outfile, hashtype=None): + def download(self, name, filename, hash, outfile, hashtype=None, **kwargs): """Download a source file
Args: @@ -115,6 +115,8 @@ class CGILookasideCache(object): outfile (str): The full path where to save the downloaded file. hashtype (str, optional): The hash algorithm. (e.g 'md5') This defaults to the hashtype passed to the constructor. + **kwargs: Additional keyword arguments. They will be used when + contructing the full URL to the file to download. """ if hashtype is None: hashtype = self.hashtype @@ -128,6 +130,7 @@ class CGILookasideCache(object):
path_dict = {'name': name, 'filename': urled_file, 'hash': hash, 'hashtype': hashtype} + path_dict.update(kwargs) path = self.download_path % path_dict url = '%s/%s' % (self.download_url, path) self.log.debug("Full url: %s" % url) diff --git a/test/test_lookaside.py b/test/test_lookaside.py index 7fe1743..4d987c1 100644 --- a/test/test_lookaside.py +++ b/test/test_lookaside.py @@ -116,6 +116,48 @@ class CGILookasideCacheTestCase(unittest.TestCase): self.assertEqual(curl.perform.call_count, 2)
@mock.patch('pyrpkg.lookaside.pycurl.Curl') + def test_download_kwargs(self, mock_curl): + def mock_getinfo(info): + return 200 if info == pycurl.RESPONSE_CODE else 0 + + def mock_perform(): + with open(self.filename, 'rb') as f: + curlopts[pycurl.WRITEDATA].write(f.read()) + + def mock_setopt(opt, value): + curlopts[opt] = value + + curlopts = {} + curl = mock_curl.return_value + curl.getinfo.side_effect = mock_getinfo + curl.perform.side_effect = mock_perform + curl.setopt.side_effect = mock_setopt + + with open(self.filename, 'wb') as f: + f.write(b'content') + + name = 'pyrpkg' + filename = 'pyrpkg-0.0.tar.xz' + branch = 'f22' + hash = hashlib.sha512(b'content').hexdigest() + outfile = os.path.join(self.workdir, 'pyrpkg-0.0.tar.xz') + + path = '%(name)s/%(filename)s/%(branch)s/%(hashtype)s/%(hash)s' + full_url = 'http://example.com/%s' % ( + path % {'name': name, 'filename': filename, 'branch': branch, + 'hashtype': 'sha512', 'hash': hash}) + + lc = CGILookasideCache('sha512', 'http://example.com', '_') + + # Modify the download path, to try arbitrary kwargs + lc.download_path = path + + lc.download(name, filename, hash, outfile, hashtype='sha512', + branch=branch) + self.assertEqual(curl.perform.call_count, 1) + self.assertEqual(curlopts[pycurl.URL], full_url) + + @mock.patch('pyrpkg.lookaside.pycurl.Curl') def test_download_corrupted(self, mock_curl): def mock_getinfo(info): return 200 if info == pycurl.RESPONSE_CODE else 0
From: Mathieu Bridon bochecha@daitauha.fr
Some downstreams, most notably fedpkg, interact with a lookaside cache which uses a self-signed certificate.
Some downstreams, still fedpkg, require an authentication for uploading source files to the lookaside cache, based on a client-side certificate.
This commit makes it easier on this downstreams to reuse our CGILookasideCache implementation.
All they need to do is define a cert_file and a ca_cert properties on their pyrpkg.Commands subclass, and things are going to work automatically.
And in fact, this is already what fedpkg does, so we're not breaking anything. --- src/pyrpkg/__init__.py | 26 +++++++++++++++++++++++++- src/pyrpkg/lookaside.py | 12 +++++++++++- 2 files changed, 36 insertions(+), 2 deletions(-)
diff --git a/src/pyrpkg/__init__.py b/src/pyrpkg/__init__.py index c98e3e6..6d818dd 100644 --- a/src/pyrpkg/__init__.py +++ b/src/pyrpkg/__init__.py @@ -179,7 +179,8 @@ class Commands(object): helper object. """ return CGILookasideCache( - self.lookasidehash, self.lookaside, self.lookaside_cgi) + self.lookasidehash, self.lookaside, self.lookaside_cgi, + client_cert=self.cert_file, ca_cert=self.ca_cert)
@property def path(self): @@ -748,6 +749,29 @@ class Commands(object): def sources_filename(self): return os.path.join(self.path, 'sources')
+ @property + def cert_file(self): + """A client-side certificate for SSL authentication + + Downstream users of the pyrpkg API should override this property if + they actually need to use a client-side certificate. + + This defaults to None, which means no client-side certificate is used. + """ + return None + + @property + def ca_cert(self): + """A CA certificate to authenticate the server in SSL connections + + Downstream users of the pyrpkg API should override this property if + they actually need to use a CA certificate, usually because their + lookaside cache is using HTTPS with a self-signed certificate. + + This defaults to None, which means the system CA bundle is used. + """ + return None + # Define some helper functions, they start with _ def _create_curl(self): """ diff --git a/src/pyrpkg/lookaside.py b/src/pyrpkg/lookaside.py index a1c17cb..e455c40 100644 --- a/src/pyrpkg/lookaside.py +++ b/src/pyrpkg/lookaside.py @@ -26,7 +26,8 @@ from .errors import DownloadError, InvalidHashType
class CGILookasideCache(object): """A class to interact with a CGI-based lookaside cache""" - def __init__(self, hashtype, download_url, upload_url): + def __init__(self, hashtype, download_url, upload_url, + client_cert=None, ca_cert=None): """Constructor
Args: @@ -34,10 +35,19 @@ class CGILookasideCache(object): download_url (str): The URL used to download source files. upload_url (str): The URL of the CGI script called when uploading source files. + client_cert (str, optional): The full path to the client-side + certificate to use for HTTPS authentication. It defaults to + None, in which case no client-side certificate is used. + ca_cert (str, optional): The full path to the CA certificate to + use for HTTPS connexions. (e.g if the server certificate is + self-signed. It defaults to None, in which case the system CA + bundle is used. """ self.hashtype = hashtype self.download_url = download_url self.upload_url = upload_url + self.client_cert = client_cert + self.ca_cert = ca_cert
self.log = logging.getLogger(__name__)
From: Mathieu Bridon bochecha@daitauha.fr
This is one more chunk of code which gets moved from pyrpkg.Commands to pyrpkg.lookaside.CGILookasideCache, where it gets entirely tested.
It also gains some better error handling, as well as the capacity to use a client-side certificate and/or a custom CA certificate: none are used by default in pyrpkg, but a client like fedpkg automatically benefits from this, without needing to do anything on its own.
The Commands._create_curl method is now completely unused, and is marked as deprecated. However its implementation code is maintained, in case downstreams were actually using it. --- src/pyrpkg/__init__.py | 58 ++++--------------- src/pyrpkg/errors.py | 5 ++ src/pyrpkg/lookaside.py | 62 +++++++++++++++++++- test/test_lookaside.py | 147 +++++++++++++++++++++++++++++++++++++++++++++++- 4 files changed, 222 insertions(+), 50 deletions(-)
diff --git a/src/pyrpkg/__init__.py b/src/pyrpkg/__init__.py index 6d818dd..14eb808 100644 --- a/src/pyrpkg/__init__.py +++ b/src/pyrpkg/__init__.py @@ -26,7 +26,6 @@ import logging import git import ConfigParser import stat -import StringIO import tempfile import fnmatch import urlparse @@ -774,11 +773,10 @@ class Commands(object):
# Define some helper functions, they start with _ def _create_curl(self): - """ - Common curl setup options used for all requests to lookaside. - """ - curl = pycurl.Curl() + warn_deprecated(self, '_create_curl', + 'lookasidecache.remote_file_exists')
+ curl = pycurl.Curl() curl.setopt(pycurl.URL, self.lookaside_cgi)
return curl @@ -1657,48 +1655,10 @@ class Commands(object): return
def file_exists(self, pkg_name, filename, checksum): - """ - Return True if the given file exists in the lookaside cache, False - if not. - - A rpkgError will be thrown if the request looks bad or something - goes wrong. (i.e. the lookaside URL cannot be reached, or the package - named does not exist) - """ - - # String buffer, used to receive output from the curl request: - buf = StringIO.StringIO() - - # Setup the POST data for lookaside CGI request. The use of - # 'filename' here appears to be what differentiates this - # request from an actual file upload. - post_data = [('name', pkg_name), - ('%ssum' % self.lookasidehash, checksum), - ('filename', filename)] - - curl = self._create_curl() - curl.setopt(pycurl.WRITEFUNCTION, buf.write) - curl.setopt(pycurl.HTTPPOST, post_data) - - try: - curl.perform() - except Exception, e: - raise rpkgError('Lookaside failure: %s' % e) - curl.close() - output = buf.getvalue().strip() - - # Lookaside CGI script returns these strings depending on whether - # or not the file exists: - if output == "Available": - return True - if output == "Missing": - return False - - # Something unexpected happened, will trigger if the lookaside URL - # cannot be reached, the package named does not exist, and probably - # some other scenarios as well. - raise rpkgError("Error checking for %s at: %s" - % (filename, self.lookaside_cgi)) + warn_deprecated(self, '_verify_file', + 'lookasidecache.remote_file_exists') + return self.lookasidecache.remote_file_exists(pkg_name, filename, + checksum)
def build(self, skip_tag=False, scratch=False, background=False, url=None, chain=None, arches=None, sets=False, nvr_check=True): @@ -2268,9 +2228,11 @@ class Commands(object): if not gitignore.match(file_basename): gitignore.add('/%s' % file_basename)
- if self.file_exists(self.module_name, file_basename, file_hash): + if self.lookasidecache.remote_file_exists( + self.module_name, file_basename, file_hash): # Already uploaded, skip it: self.log.info("File already uploaded: %s" % file_basename) + else: # Ensure the new file is readable: os.chmod(f, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH) diff --git a/src/pyrpkg/errors.py b/src/pyrpkg/errors.py index 9547702..5232f63 100644 --- a/src/pyrpkg/errors.py +++ b/src/pyrpkg/errors.py @@ -42,3 +42,8 @@ class InvalidHashType(rpkgError): class DownloadError(rpkgError): """Raised when something went wrong during a download""" pass + + +class UploadError(rpkgError): + """Raised when something went wrong during an upload""" + pass diff --git a/src/pyrpkg/lookaside.py b/src/pyrpkg/lookaside.py index e455c40..240bd6a 100644 --- a/src/pyrpkg/lookaside.py +++ b/src/pyrpkg/lookaside.py @@ -15,13 +15,14 @@ way it is done by Fedora, RHEL, and other distributions maintainers.
import hashlib +import io import logging import os import sys
import pycurl
-from .errors import DownloadError, InvalidHashType +from .errors import DownloadError, InvalidHashType, UploadError
class CGILookasideCache(object): @@ -176,3 +177,62 @@ class CGILookasideCache(object):
if not self.file_is_valid(outfile, hash, hashtype=hashtype): raise DownloadError('%s failed checksum' % filename) + + def remote_file_exists(self, name, filename, hash): + """Verify whether a file exists on the lookaside cache + + Args: + name: The name of the module. (usually the name of the SRPM) + filename: The name of the file to check for. + hash: The known good hash of the file. + """ + post_data = [('name', name), + ('%ssum' % self.hashtype, hash), + ('filename', filename)] + + with io.BytesIO() as buf: + c = pycurl.Curl() + c.setopt(pycurl.URL, self.upload_url) + c.setopt(pycurl.WRITEFUNCTION, buf.write) + c.setopt(pycurl.HTTPPOST, post_data) + + if self.client_cert is not None: + if os.path.exists(self.client_cert): + c.setopt(pycurl.SSLCERT, self.client_cert) + else: + self.log.warn("Missing certificate: %s" + % self.client_cert) + + if self.ca_cert is not None: + if os.path.exists(self.ca_cert): + c.setopt(pycurl.CAINFO, self.ca_cert) + else: + self.log.warn("Missing certificate: %s" % self.ca_cert) + + try: + c.perform() + status = c.getinfo(pycurl.RESPONSE_CODE) + + except Exception as e: + raise UploadError(e) + + finally: + c.close() + + output = buf.getvalue().strip() + + if status != 200: + raise UploadError(output) + + # Lookaside CGI script returns these strings depending on whether + # or not the file exists: + if output == b'Available': + return True + + if output == b'Missing': + return False + + # Something unexpected happened + self.log.debug(output) + raise UploadError('Error checking for %s at %s' + % (filename, self.upload_url)) diff --git a/test/test_lookaside.py b/test/test_lookaside.py index 4d987c1..b6cc71d 100644 --- a/test/test_lookaside.py +++ b/test/test_lookaside.py @@ -21,7 +21,7 @@ old_path = list(sys.path) src_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), '../src') sys.path.insert(0, src_path) from pyrpkg.lookaside import CGILookasideCache -from pyrpkg.errors import DownloadError, InvalidHashType +from pyrpkg.errors import DownloadError, InvalidHashType, UploadError sys.path = old_path
@@ -309,3 +309,148 @@ class CGILookasideCacheTestCase(unittest.TestCase): lc = CGILookasideCache('_', '_', '_') lc.print_progress(0.0, 0.0, 0.0, 0.0) self.assertEqual(len(written_lines), 0) + + @mock.patch('pyrpkg.lookaside.pycurl.Curl') + def test_remote_file_exists(self, mock_curl): + def mock_perform(): + curlopts[pycurl.WRITEFUNCTION](b'Available') + + def mock_setopt(opt, value): + curlopts[opt] = value + + curlopts = {} + curl = mock_curl.return_value + curl.getinfo.return_value = 200 + curl.perform.side_effect = mock_perform + curl.setopt.side_effect = mock_setopt + + lc = CGILookasideCache('_', '_', '_') + exists = lc.remote_file_exists('pyrpkg', 'pyrpkg-0.tar.xz', 'thehash') + self.assertTrue(exists) + + @mock.patch('pyrpkg.lookaside.pycurl.Curl') + def test_remote_file_does_not_exist(self, mock_curl): + def mock_perform(): + curlopts[pycurl.WRITEFUNCTION](b'Missing') + + def mock_setopt(opt, value): + curlopts[opt] = value + + curlopts = {} + curl = mock_curl.return_value + curl.getinfo.return_value = 200 + curl.perform.side_effect = mock_perform + curl.setopt.side_effect = mock_setopt + + lc = CGILookasideCache('_', '_', '_') + exists = lc.remote_file_exists('pyrpkg', 'pyrpkg-0.tar.xz', 'thehash') + self.assertFalse(exists) + + @mock.patch('pyrpkg.lookaside.pycurl.Curl') + def test_remote_file_exists_with_custom_certs(self, mock_curl): + def mock_perform(): + curlopts[pycurl.WRITEFUNCTION](b'Available') + + def mock_setopt(opt, value): + curlopts[opt] = value + + curlopts = {} + curl = mock_curl.return_value + curl.getinfo.return_value = 200 + curl.perform.side_effect = mock_perform + curl.setopt.side_effect = mock_setopt + + client_cert = os.path.join(self.workdir, 'my-client-cert.cert') + with open(client_cert, 'w'): + pass + + ca_cert = os.path.join(self.workdir, 'my-custom-cacert.cert') + with open(ca_cert, 'w'): + pass + + lc = CGILookasideCache('_', '_', '_', client_cert=client_cert, + ca_cert=ca_cert) + lc.remote_file_exists('pyrpkg', 'pyrpkg-0.tar.xz', 'thehash') + self.assertEqual(curlopts[pycurl.SSLCERT], client_cert) + self.assertEqual(curlopts[pycurl.CAINFO], ca_cert) + + @mock.patch('pyrpkg.lookaside.logging.getLogger') + @mock.patch('pyrpkg.lookaside.pycurl.Curl') + def test_remote_file_exists_missing_custom_certs(self, mock_curl, + mock_logger): + def mock_perform(): + curlopts[pycurl.WRITEFUNCTION](b'Available') + + def mock_setopt(opt, value): + curlopts[opt] = value + + def mock_warn(msg): + warn_messages.append(msg) + + curlopts = {} + curl = mock_curl.return_value + curl.getinfo.return_value = 200 + curl.perform.side_effect = mock_perform + curl.setopt.side_effect = mock_setopt + + warn_messages = [] + log = mock_logger.return_value + log.warn.side_effect = mock_warn + + client_cert = os.path.join(self.workdir, 'my-client-cert.cert') + ca_cert = os.path.join(self.workdir, 'my-custom-cacert.cert') + + lc = CGILookasideCache('_', '_', '_', client_cert=client_cert, + ca_cert=ca_cert) + lc.remote_file_exists('pyrpkg', 'pyrpkg-0.tar.xz', 'thehash') + self.assertTrue(pycurl.SSLCERT not in curlopts) + self.assertTrue(pycurl.CAINFO not in curlopts) + self.assertEqual(len(warn_messages), 2) + self.assertTrue('Missing certificate: ' in warn_messages[0]) + self.assertTrue('Missing certificate: ' in warn_messages[1]) + + @mock.patch('pyrpkg.lookaside.pycurl.Curl') + def test_remote_file_exists_check_failed(self, mock_curl): + curl = mock_curl.return_value + curl.perform.side_effect = Exception( + 'Could not resolve host: example.com') + + lc = CGILookasideCache('_', '_', '_') + self.assertRaises(UploadError, lc.remote_file_exists, 'pyrpkg', + 'pyrpkg-0.tar.xz', 'thehash') + + @mock.patch('pyrpkg.lookaside.pycurl.Curl') + def test_remote_file_exists_check_failed_status_code(self, mock_curl): + def mock_perform(): + curlopts[pycurl.WRITEFUNCTION](b'Available') + + def mock_setopt(opt, value): + curlopts[opt] = value + + curlopts = {} + curl = mock_curl.return_value + curl.getinfo.return_value = 500 + curl.perform.side_effect = mock_perform + curl.setopt.side_effect = mock_setopt + + lc = CGILookasideCache('_', '_', '_') + self.assertRaises(UploadError, lc.remote_file_exists, 'pyrpkg', + 'pyrpkg-0.0.tar.xz', 'thehash') + + @mock.patch('pyrpkg.lookaside.pycurl.Curl') + def test_remote_file_exists_check_unexpected_error(self, mock_curl): + def mock_perform(): + curlopts[pycurl.WRITEFUNCTION]('Something unexpected') + + def mock_setopt(opt, value): + curlopts[opt] = value + + curlopts = {} + curl = mock_curl.return_value + curl.getinfo.return_value = 200 + curl.perform.side_effect = mock_perform + curl.setopt.side_effect = mock_setopt + + lc = CGILookasideCache('_', '_', '_') + self.assertRaises(UploadError, lc.remote_file_exists, 'pyrpkg', + 'pyrpkg-0.tar.xz', 'thehash')
On Wed, May 06, 2015 at 01:53:12PM +0200, Mathieu Bridon wrote:
if self.client_cert is not None:
if os.path.exists(self.client_cert):
c.setopt(pycurl.SSLCERT, self.client_cert)
else:
self.log.warn("Missing certificate: %s"
% self.client_cert)
if self.ca_cert is not None:
if os.path.exists(self.ca_cert):
c.setopt(pycurl.CAINFO, self.ca_cert)
else:
self.log.warn("Missing certificate: %s" % self.ca_cert)
Hm, I think that these two ``log.warn`` statements should probably be raised exceptions.
If I'm reading this right, in the previous commit you provide an interface for users of the library to override properties to define their own client certs and ca certs. If they use that and specify those paths explicitly, but they don't exist on disk, that warrants more than just a warning (which might get swallowed up by someone's incorrectly configured logging somewhere). An outright exception that stops execution is due.
On Wed, 2015-05-06 at 10:23 -0400, Ralph Bean wrote:
On Wed, May 06, 2015 at 01:53:12PM +0200, Mathieu Bridon wrote:
if self.client_cert is not None:
if os.path.exists(self.client_cert):
c.setopt(pycurl.SSLCERT, self.client_cert)
else:
self.log.warn("Missing certificate: %s"
% self.client_cert)
if self.ca_cert is not None:
if os.path.exists(self.ca_cert):
c.setopt(pycurl.CAINFO, self.ca_cert)
else:
self.log.warn("Missing certificate: %s" % self.ca_cert)
Hm, I think that these two ``log.warn`` statements should probably be raised exceptions.
If I'm reading this right, in the previous commit you provide an interface for users of the library to override properties to define their own client certs and ca certs. If they use that and specify those paths explicitly, but they don't exist on disk, that warrants more than just a warning (which might get swallowed up by someone's incorrectly configured logging somewhere). An outright exception that stops execution is due.
I'd agree with you in absolute, but the patch series is already pretty big, so I preferred avoiding to sneak in some behavioural changes like this.
fedpkg currently only logs a warning, so I kept the same here.
We could change this of course, but I thought it would be better to change it in a followup patch, rather than in the middle of this big series.
If there's consensus on the list to raise exceptions here, I'll send a patch 22/21. :)
From: Mathieu Bridon bochecha@daitauha.fr
--- src/pyrpkg/__init__.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/pyrpkg/__init__.py b/src/pyrpkg/__init__.py index 14eb808..bb0ff3f 100644 --- a/src/pyrpkg/__init__.py +++ b/src/pyrpkg/__init__.py @@ -2204,7 +2204,6 @@ class Commands(object): for f in files: # TODO: Skip empty file needed? file_hash = self.lookasidecache.hash_file(f) - self.log.info("Uploading: %s %s" % (file_hash, f)) file_basename = os.path.basename(f)
try: @@ -2234,6 +2233,7 @@ class Commands(object): self.log.info("File already uploaded: %s" % file_basename)
else: + self.log.info("Uploading: %s %s" % (file_hash, f)) # Ensure the new file is readable: os.chmod(f, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH) self._do_curl(file_hash, f)
From: Mathieu Bridon bochecha@daitauha.fr
Each of these actions is entirely obvious to anybody reading the source code, so these 3 comments are nothing but noise, in a file which is already way too long. --- src/pyrpkg/__init__.py | 5 ----- 1 file changed, 5 deletions(-)
diff --git a/src/pyrpkg/__init__.py b/src/pyrpkg/__init__.py index bb0ff3f..23b67a7 100644 --- a/src/pyrpkg/__init__.py +++ b/src/pyrpkg/__init__.py @@ -2196,8 +2196,6 @@ class Commands(object):
sourcesf = SourcesFile(self.sources_filename, self.source_entry_type, replace=replace) - - # Will add new sources to .gitignore if they are not already there. gitignore = GitIgnore(os.path.join(self.path, '.gitignore'))
uploaded = [] @@ -2229,7 +2227,6 @@ class Commands(object):
if self.lookasidecache.remote_file_exists( self.module_name, file_basename, file_hash): - # Already uploaded, skip it: self.log.info("File already uploaded: %s" % file_basename)
else: @@ -2240,8 +2237,6 @@ class Commands(object): uploaded.append(file_basename)
sourcesf.write() - - # Write .gitignore with the new sources if anything changed: gitignore.write()
self.repo.index.add(['sources', '.gitignore'])
From: Mathieu Bridon bochecha@daitauha.fr
These lines are trying to ensure that we can read the file content, and being able to read it is of course necessary if we're uploading it.
However, a few lines above, we actually computed the hash of the file, which... required being able to read it.
So if the file wasn't readable, we'd have failed much earlier anyway, which makes those two lines useless.
In addition, they make the file read-only, which is annoying for package maintainers when try to remove it, as they then require a confirmation. --- src/pyrpkg/__init__.py | 3 --- 1 file changed, 3 deletions(-)
diff --git a/src/pyrpkg/__init__.py b/src/pyrpkg/__init__.py index 23b67a7..a8ca735 100644 --- a/src/pyrpkg/__init__.py +++ b/src/pyrpkg/__init__.py @@ -25,7 +25,6 @@ import rpm import logging import git import ConfigParser -import stat import tempfile import fnmatch import urlparse @@ -2231,8 +2230,6 @@ class Commands(object):
else: self.log.info("Uploading: %s %s" % (file_hash, f)) - # Ensure the new file is readable: - os.chmod(f, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH) self._do_curl(file_hash, f) uploaded.append(file_basename)
From: Mathieu Bridon bochecha@daitauha.fr
We are already logging each file we are uploading, so there really is no need to add such a summary at the end.
Removing it also makes the code simpler, as we don't need to keep track of the list of uploaded files any more. --- src/pyrpkg/__init__.py | 6 ------ 1 file changed, 6 deletions(-)
diff --git a/src/pyrpkg/__init__.py b/src/pyrpkg/__init__.py index a8ca735..391085c 100644 --- a/src/pyrpkg/__init__.py +++ b/src/pyrpkg/__init__.py @@ -2197,7 +2197,6 @@ class Commands(object): replace=replace) gitignore = GitIgnore(os.path.join(self.path, '.gitignore'))
- uploaded = [] for f in files: # TODO: Skip empty file needed? file_hash = self.lookasidecache.hash_file(f) @@ -2231,7 +2230,6 @@ class Commands(object): else: self.log.info("Uploading: %s %s" % (file_hash, f)) self._do_curl(file_hash, f) - uploaded.append(file_basename)
sourcesf.write() gitignore.write() @@ -2241,10 +2239,6 @@ class Commands(object): # Change back to original working dir: os.chdir(oldpath)
- # Log some info - self.log.info('Uploaded and added to .gitignore: %s' % - ' '.join(uploaded)) - def prep(self, arch=None, builddir=None): """Run rpm -bp on a module
From: Mathieu Bridon bochecha@daitauha.fr
This is the last piece of code related to the lookaside cache interaction which needed to be moved.
In addition to just moving the code, we now better handle errors, we exclusively use pycurl instead of calling the curl command, and this whole thing is completely covered by unit tests.
It also automatically handles usage of client-side and CA certificates, to make it easier for downstreams like fedpkg.
The Commands._do_curl method is now completely unused, and is marked as deprecated. However its implementation code is maintained, in case downstreams were actually using it. --- src/pyrpkg/__init__.py | 11 ++-- src/pyrpkg/lookaside.py | 61 ++++++++++++++++++++++ test/test_lookaside.py | 135 ++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 199 insertions(+), 8 deletions(-)
diff --git a/src/pyrpkg/__init__.py b/src/pyrpkg/__init__.py index 391085c..d366395 100644 --- a/src/pyrpkg/__init__.py +++ b/src/pyrpkg/__init__.py @@ -914,7 +914,8 @@ class Commands(object): return os.path.getmtime(file1) > os.path.getmtime(file2)
def _do_curl(self, file_hash, file): - """Use curl manually to upload a file""" + warn_deprecated(self.__class__.__name__, '_do_curl', + 'lookasidecache.upload')
cmd = ['curl', '--fail', '-o', '/dev/null', '--show-error', '--progress-bar', '-F', 'name=%s' % self.module_name, @@ -2223,13 +2224,7 @@ class Commands(object): if not gitignore.match(file_basename): gitignore.add('/%s' % file_basename)
- if self.lookasidecache.remote_file_exists( - self.module_name, file_basename, file_hash): - self.log.info("File already uploaded: %s" % file_basename) - - else: - self.log.info("Uploading: %s %s" % (file_hash, f)) - self._do_curl(file_hash, f) + self.lookasidecache.upload(self.module_name, f, file_hash)
sourcesf.write() gitignore.write() diff --git a/src/pyrpkg/lookaside.py b/src/pyrpkg/lookaside.py index 240bd6a..6bf5c42 100644 --- a/src/pyrpkg/lookaside.py +++ b/src/pyrpkg/lookaside.py @@ -236,3 +236,64 @@ class CGILookasideCache(object): self.log.debug(output) raise UploadError('Error checking for %s at %s' % (filename, self.upload_url)) + + def upload(self, name, filepath, hash): + """Upload a source file + + Args: + name (str): The name of the module. (usually the name of the SRPM) + filepath (str): The full path to the file to upload. + hash (str): The known good hash of the file. + """ + filename = os.path.basename(filepath) + + if self.remote_file_exists(name, filename, hash): + self.log.info("File already uploaded: %s" % filepath) + return + + self.log.info("Uploading: %s" % filepath) + post_data = [('name', name), + ('%ssum' % self.hashtype, hash), + ('file', (pycurl.FORM_FILE, filepath))] + + with io.BytesIO() as buf: + c = pycurl.Curl() + c.setopt(pycurl.URL, self.upload_url) + c.setopt(pycurl.NOPROGRESS, False) + c.setopt(pycurl.PROGRESSFUNCTION, self.print_progress) + c.setopt(pycurl.WRITEFUNCTION, buf.write) + c.setopt(pycurl.HTTPPOST, post_data) + + if self.client_cert is not None: + if os.path.exists(self.client_cert): + c.setopt(pycurl.SSLCERT, self.client_cert) + else: + self.log.warn("Missing certificate: %s" % self.client_cert) + + if self.ca_cert is not None: + if os.path.exists(self.ca_cert): + c.setopt(pycurl.CAINFO, self.ca_cert) + else: + self.log.warn("Missing certificate: %s" % self.ca_cert) + + try: + c.perform() + status = c.getinfo(pycurl.RESPONSE_CODE) + + except Exception as e: + raise UploadError(e) + + finally: + c.close() + + output = buf.getvalue().strip() + + # Get back a new line, after displaying the download progress + sys.stdout.write('\n') + sys.stdout.flush() + + if status != 200: + raise UploadError(output) + + if output: + self.log.debug(output) diff --git a/test/test_lookaside.py b/test/test_lookaside.py index b6cc71d..eced096 100644 --- a/test/test_lookaside.py +++ b/test/test_lookaside.py @@ -454,3 +454,138 @@ class CGILookasideCacheTestCase(unittest.TestCase): lc = CGILookasideCache('_', '_', '_') self.assertRaises(UploadError, lc.remote_file_exists, 'pyrpkg', 'pyrpkg-0.tar.xz', 'thehash') + + @mock.patch('pyrpkg.lookaside.logging.getLogger') + @mock.patch('pyrpkg.lookaside.pycurl.Curl') + def test_upload(self, mock_curl, mock_logger): + def mock_setopt(opt, value): + curlopts[opt] = value + + def mock_perform(): + curlopts[pycurl.WRITEFUNCTION](b'Some output') + + def mock_debug(msg): + debug_messages.append(msg) + + curlopts = {} + curl = mock_curl.return_value + curl.getinfo.return_value = 200 + curl.perform.side_effect = mock_perform + curl.setopt.side_effect = mock_setopt + + debug_messages = [] + log = mock_logger.return_value + log.debug.side_effect = mock_debug + + lc = CGILookasideCache('sha512', '_', '_') + + with mock.patch.object(lc, 'remote_file_exists', lambda *x: False): + lc.upload('pyrpkg', 'pyrpkg-0.0.tar.xz', 'thehash') + + self.assertTrue(pycurl.HTTPPOST in curlopts) + self.assertEqual(curlopts[pycurl.HTTPPOST], [ + ('name', 'pyrpkg'), ('sha512sum', 'thehash'), + ('file', (pycurl.FORM_FILE, 'pyrpkg-0.0.tar.xz'))]) + + self.assertEqual(debug_messages, [b'Some output']) + + @mock.patch('pyrpkg.lookaside.pycurl.Curl') + def test_upload_already_exists(self, mock_curl): + curl = mock_curl.return_value + + lc = CGILookasideCache('_', '_', '_') + hash = 'thehash' + + with mock.patch.object(lc, 'remote_file_exists', lambda *x: True): + lc.upload('pyrpkg', 'pyrpkg-0.0.tar.xz', hash) + + self.assertEqual(curl.perform.call_count, 0) + self.assertEqual(curl.setopt.call_count, 0) + + @mock.patch('pyrpkg.lookaside.pycurl.Curl') + def test_upload_with_custom_certs(self, mock_curl): + def mock_setopt(opt, value): + curlopts[opt] = value + + curlopts = {} + curl = mock_curl.return_value + curl.getinfo.return_value = 200 + curl.setopt.side_effect = mock_setopt + + client_cert = os.path.join(self.workdir, 'my-client-cert.cert') + with open(client_cert, 'w'): + pass + + ca_cert = os.path.join(self.workdir, 'my-custom-cacert.cert') + with open(ca_cert, 'w'): + pass + + lc = CGILookasideCache('_', '_', '_', client_cert=client_cert, + ca_cert=ca_cert) + + with mock.patch.object(lc, 'remote_file_exists', lambda *x: False): + lc.upload('pyrpkg', 'pyrpkg-0.0.tar.xz', 'thehash') + + self.assertEqual(curlopts[pycurl.SSLCERT], client_cert) + self.assertEqual(curlopts[pycurl.CAINFO], ca_cert) + + @mock.patch('pyrpkg.lookaside.logging.getLogger') + @mock.patch('pyrpkg.lookaside.pycurl.Curl') + def test_upload_missing_custom_certs(self, mock_curl, mock_logger): + def mock_setopt(opt, value): + curlopts[opt] = value + + def mock_warn(msg): + warn_messages.append(msg) + + curlopts = {} + curl = mock_curl.return_value + curl.getinfo.return_value = 200 + curl.setopt.side_effect = mock_setopt + + warn_messages = [] + log = mock_logger.return_value + log.warn.side_effect = mock_warn + + client_cert = os.path.join(self.workdir, 'my-client-cert.cert') + ca_cert = os.path.join(self.workdir, 'my-custom-cacert.cert') + + lc = CGILookasideCache('_', '_', '_', client_cert=client_cert, + ca_cert=ca_cert) + + with mock.patch.object(lc, 'remote_file_exists', lambda *x: False): + lc.upload('pyrpkg', 'pyrpkg-0.tar.xz', 'thehash') + + self.assertTrue(pycurl.SSLCERT not in curlopts) + self.assertTrue(pycurl.CAINFO not in curlopts) + self.assertEqual(len(warn_messages), 2) + self.assertTrue('Missing certificate: ' in warn_messages[0]) + self.assertTrue('Missing certificate: ' in warn_messages[1]) + + @mock.patch('pyrpkg.lookaside.pycurl.Curl') + def test_upload_failed(self, mock_curl): + curl = mock_curl.return_value + curl.perform.side_effect = Exception( + 'Could not resolve host: example.com') + + lc = CGILookasideCache('_', '_', '_') + + with mock.patch.object(lc, 'remote_file_exists', lambda *x: False): + self.assertRaises(UploadError, lc.upload, 'pyrpkg', + 'pyrpkg-0.tar.xz', 'thehash') + + @mock.patch('pyrpkg.lookaside.pycurl.Curl') + def test_upload_failed_status_code(self, mock_curl): + def mock_setopt(opt, value): + curlopts[opt] = value + + curlopts = {} + curl = mock_curl.return_value + curl.getinfo.return_value = 500 + curl.setopt.side_effect = mock_setopt + + lc = CGILookasideCache('sha512', '_', '_') + + with mock.patch.object(lc, 'remote_file_exists', lambda *x: False): + self.assertRaises(UploadError, lc.upload, 'pyrpkg', + 'pyrpkg-0.tar.xz', 'thehash')
On Wed, May 06, 2015 at 01:53:17PM +0200, Mathieu Bridon wrote:
From: Mathieu Bridon bochecha@daitauha.fr
if self.client_cert is not None:
if os.path.exists(self.client_cert):
c.setopt(pycurl.SSLCERT, self.client_cert)
else:
self.log.warn("Missing certificate: %s" % self.client_cert)
if self.ca_cert is not None:
if os.path.exists(self.ca_cert):
c.setopt(pycurl.CAINFO, self.ca_cert)
else:
self.log.warn("Missing certificate: %s" % self.ca_cert)
Same issue here as in my other comment.
On Wed, May 06, 2015 at 01:52:56PM +0200, Mathieu Bridon wrote:
This patch series moves the code which talks to the remote lookaside cache, rewriting parts of it. This makes the code more maintainable, and provides a nicer, easier to reuse and extend API.
At the same time, some features are added which could allow some downstreams (fedpkg for example, but also probably centpkg) to remove some of their code completely, without losing any functionality.
Speaking of downstream users of the pyrpkg API, this patch series does not break the API, downstreams don't actually require any code change to keep working as they currently do.
All of this is thoroughly tested, the newly introduced modules are fully covered by unit tests. As a result, global test coverage goes from 26% to 36%.
In order to make this more maintainable and better tested, the code now does not call the 'curl' command as a subprocess any more. Instead, everything uses pycurl, which was already used in pyrpkg and as such no new runtime dependency is introduced.
Some code paths are marked as deprecated, and will emit messages when they are executed, so that downstream users of the pyrpkg API can learn about the deprecations, and port their code. That code could then be removed from pyrpkg eventually, but this patch series does not do that.
The new APIs are entirely documented. Try this for example:
from pyrpkg.lookaside import CGILookasideCache help(CGILookasideCache)
As a cherry on the cake, the new modules (and their unit tests) are fully compatible with Python 3, so we'll have less porting work when the time comes. Of course, the code still is compatible with Python 2.6, so things should still work on EL6.
Finally, these changes are going to help immensely when we try to move away from MD5 checksums for the lookaside cache in Fedora.
All in all, this is a lot of changes:
src/pyrpkg/__init__.py | 263 +++++++------------- src/pyrpkg/cli.py | 8 +- src/pyrpkg/errors.py | 49 ++++ src/pyrpkg/lookaside.py | 299 ++++++++++++++++++++++ src/pyrpkg/sources.py | 14 +- src/pyrpkg/utils.py | 63 +++++ test/test_lookaside.py | 591 ++++++++++++++++++++++++++++++++++++++++++++ test/test_utils.py | 159 ++++++++++++ 8 files changed, 1258 insertions(+), 188 deletions(-)
I've made a copr where I've pushed packages built with these changes, and a couple of packager-friends have been using them for a few days already. They are heavy-users of pyrpkg (through fedpkg, rhpkg and rdopkg) and all the problems they could find are fixed in this patch series.
All in all, this is a very good, high-quality change-set.
rel-eng@lists.fedoraproject.org