Closed Bug 701506 Opened 13 years ago Closed 12 years ago

[tracker] create python package webserver

Categories

(Release Engineering :: General, defect, P1)


Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mozilla, Assigned: dustin)

References

Details

(Whiteboard: [automation][mozbase][mozharness+talos])

Tarballs-on-a-webserver should be fine.
This needs to have an internal IP so the build/test farms can reach it. If it has another NIC or interface that's external, that's fine and would be more community-friendly.

This will allow us to

a) share our own homegrown python packages, and
b) allow our build/test farms to install python packages without hitting the internet.

This should be very simple, but this will likely become a key piece of infrastructure, so should be designed with anti-SPOF concepts in mind.
Allowing LDAP-authed people access to upload new python packages would be best, so as not to create another releng bottleneck.
There were also rumblings of putting rpms on here.
I've done this before and outlined a few steps here:

http://k0s.org/portfolio/pypi.html

You can do this with a static fileserver, apart from the package-upload part.  If we have a POST handler to do this, we could maybe upload a tarball or just point to a repo.  In either case, the POST handler should examine the package to ensure that the package name, etc., is right.

FWIW, I'd be happy to do a POC here.
Sure.

I think we should partition the webroot into three at first blush:

python/
rpms/
random_crap/ # talos profile zipfiles?

But that's probably not a blocker for a POC.
I am most interested in creating a python package index, which could be mounted under python wrt comment 4 above.
Sorry, hit submit too soon.  Since this would presumably be a WSGI application, its mounting would not affect the other PATH_INFO-dispatched apps.
See 

* http://peak.telecommunity.com/DevCenter/EasyInstall#package-index-api
* http://wiki.python.org/moin/CheeseShopDev

Were I implementing this, I would create a simple view with webob rather than using CheeseShop, but of course this depends on what is desired.
Talked to tarek.  He gave me some more pointers to things that already exist:

* http://pypi.python.org/pypi/pep381client is used for the Services internal pypi; however, it cannot mirror subsets of pypi, only the whole thing
* http://pypi.python.org/pypi/EggBasket is another project
* http://pypi.python.org/pypi/z3c.pypimirror is similar to pep381client but you can filter what is mirrored
Whiteboard: [automation] → [automation][mozbase]
So guys, what's the final plan here?
I can spec out a solution if desired, but I am ill-equipped to guess what is appropriate for releng infrastructure. If I were going to implement something, it would be a simple webob WSGI app with essentially only a POST handler (for adding new eggs from .tar.gz sources, or optionally URLs, though if it is behind the build VPN the latter is probably not useful since it can't reach the outside world) and a static fileserver in middleware, using paste's static URL parser as a passthrough fileserver (see e.g. http://k0s.org/hg/genshi_view/file/807c8eef8098/genshi_view/template/%2Bpackage%2B/factory.py_tmpl#l8 ).  There could be some templates involved, e.g. for the '/' view, but mostly it is just the POST handler, which would:
1. unpack the .tar.gz file.  
2. run setup.py e.g. egg_info and/or sdist to create the correct tarball
3. put it in the appropriate place in the served directory structure
I would choose this over just keying off of the uploaded filename, as it would prevent human error, and because (e.g.) http://hg.mozilla.org/build/talos/archive/tip.tar.gz doesn't contain any information about the package.
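
As a concrete illustration of those three steps, a rough webob sketch of such a POST handler might look like the following; this is a sketch under stated assumptions, not a committed design (the PACKAGES_DIR path and the 'package' form-field name are made up, and running setup.py trusts the uploader, which leans on the VPN assumption):

    import os, shutil, subprocess, tarfile, tempfile
    from webob import Request, Response
    from webob import exc

    PACKAGES_DIR = '/var/www/packages'  # assumed; served by the static fileserver

    def application(environ, start_response):
        request = Request(environ)
        if request.method != 'POST':
            return exc.HTTPMethodNotAllowed()(environ, start_response)
        upload = request.POST['package']  # uploaded .tar.gz (field name assumed)
        tmpdir = tempfile.mkdtemp()
        try:
            # 1. unpack the .tar.gz file
            tarfile.open(fileobj=upload.file).extractall(tmpdir)
            srcdir = os.path.join(tmpdir, os.listdir(tmpdir)[0])
            # 2. let setup.py determine the real name/version and build the sdist
            subprocess.check_call(['python', 'setup.py', 'sdist'], cwd=srcdir)
            distdir = os.path.join(srcdir, 'dist')
            sdist = os.listdir(distdir)[0]
            # 3. file the tarball in the served directory structure
            dest = os.path.join(PACKAGES_DIR, sdist.rsplit('-', 1)[0].lower())
            if not os.path.isdir(dest):
                os.makedirs(dest)
            shutil.copy(os.path.join(distdir, sdist), dest)
            response = Response('uploaded %s\n' % sdist)
        finally:
            shutil.rmtree(tmpdir)
        return response(environ, start_response)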

However,
A) This is a bit of reinventing the wheel.  I have listed several links in the above comments that could be assessed for their appropriateness.  However, my roll-your-own approach requires minimal setup, is very testable, and could be altered to serve further needs/workflow down the line.  OTOH, if something else works better off the shelf and/or is more palatable to releng maintenance needs, then using it would probably be better.
B) As said, I can't guess what is desired for the releng infrastructure.

I think we need more precise goals and ownership of this ticket.  I am more than happy to write the software I described above, but would like more of a nod that that is where we want to go.  If we're going to use off-the-shelf software, then it's probably better for someone who knows the infrastructure to investigate.
What's wrong with tarballs-on-a-webserver again?
(In reply to Chris AtLee [:catlee] from comment #11)
> What's wrong with tarballs-on-a-webserver again?

That is fine by me.  How would stakeholders (like the A*Team) notify releng that a particular set of packages needs to be updated?

(TL;DR that's pretty much what I meant with my solution in comment 10, with the POST handler taking care of the updating part)
It would also be nice to be able to see what packages + versions do exist without having to get access to the build VPN (nice to have, not mandatory, but would likely obviate headaches)
Blocks: 713055
Priority: P3 → P1
Whiteboard: [automation][mozbase] → [automation][mozbase][mozharness+talos]
This is going to be required to put Talos atop mozbase/mozharness.  So, we need to accomplish it in 2012 Q1 so that we are set up to complete Talos mozbase/mozharness in Q2.  I have my guys ready to do the work, but we need very clear requirements from y'all (Releng) so that we build something we are both happy with.

Parsing the comments up above, I see the following requirements (Please comment and correct as needed).
* Simple simple simple.  It has to be brain dead simple.
* Needs to be accessible from both Releng and A-team worlds so that staging systems, developers, and build infrastructure can all use it (i.e. dual-homed, on the build VPN and an internal network)
* Needs to work completely independently of any external resource (i.e. no links to pypi.python.org)
* We need a way to store tarballs on a server so that python's setup.py scripts can download them (these tarballs are python packages)
* We need a way to view what versions of packages exist on the server without ssh'ing into it.
* We need a way to upload new packages and new package versions to the server without ssh'ing into it.
** System needs to keep multiple versions of packages accessible to its clients
* We need a way to remove a package version from the system.  Perhaps removal *should* require ssh access? This is a dangerous operation and could burn a tree, so I'm happy if this requires buildvpn+ssh access.
* Does it need to store more than python packages?  If so, should a separate web application control those or should the same web-app control both the python packages as well as the other items on the server?
** If we go with one of the "standard" projects from comment 8 they will be ill-equipped to monitor anything that isn't a python package.  So this question is key to understand if we need to build our own or if we can use something off the shelf.
** However, a simple web front end to a file server is really quite simple; it would just need to be able to report version strings and/or checksums for the objects in its file system.

= P2 =
* A UI for this system.  I think that if the first version of this simply produces something with a REST api, then that will be sufficient for our needs.  Of course once we have that api, a UI is pretty trivial.

= Deliverables =
* Server system that meets requirements above (once we agree on those)
* One-file python module that talks to the server so that it is easy to incorporate this into mozharness/mozbase code as needed.  Note, this client may not be needed if we create something so that the setuptools setup.py scripts already understand how to communicate with the server.  However, if the server manages more than just python packages, this client will be needed and should be provided.
* A test client/server instance that runs through all available commands and ensures all requirements are met (for example, ensures no package can be deleted from the REST API)

= What we need from you =
I'd like *everyone* interested to comment on the above sections so that we can understand what we need to build.  Once we have comments from y'all, I'll re-summarize the requirements and we will sign off on them and go build this thing.
(In reply to Clint Talbert ( :ctalbert ) from comment #14)
> * Needs to be accessible from both Releng and A-team worlds so that both
> staging systems, developers, and build infrastructure can use it. (i.e. dual
> homed, on build VPN and an internal network)

Though I don't know of any direct plans in the works that would require me to have access, I can imagine future situations using this because it is set up, thus creating a dependency.

So if possible and not too much work, I would love SeaMonkey Build Infra to have access to this system as well.

(What I don't need is the ability to upload packages myself, make any changes at all, etc.) I don't *think* there is anything planned to be secret on the system. And until something I use depends on it, this *can* be P2/3; I just mention it so it is thought about/planned if at all possible. (Or if it's easier to do at setup time, that's fine too.)
From build.mozilla.org side I only care about the following (which can go under "random_crap"):

[armenzg@dm-wwwbuild01 ~]$ cd /var/www/html/build/talos/tools/
[armenzg@dm-wwwbuild01 talos]$ ls -l
total 16
drwxrwxr-x  2 syncbld build 4096 Jan  9 00:17 profiles
drwxrwsr-x 13 catlee  build 4096 Dec 13 14:27 tools
drwxrwsr-x  2 catlee  build 4096 Sep 21 07:03 xpis
drwxrwsr-x  3 catlee  build 4096 Dec 13 15:23 zips
[armenzg@dm-wwwbuild01 talos]$ ls -l profiles
total 62756
-rw-r--r-- 1 syncbld syncbld  7671833 Jan  9 00:01 dirtyDBs.zip
-rw-r--r-- 1 syncbld syncbld 56509398 Jan  9 00:17 dirtyMaxDBs.zip
[armenzg@dm-wwwbuild01 talos]$ ls -l xpis
total 60
-rw-r--r-- 1 bhearsum build 18454 Sep 19 07:02 pageloader.xpi
-rw-rw-r-- 1 coop     build 17542 May  9  2011 pageloader.xpi.old.20110509
-rw-r--r-- 1 bhearsum build 17347 Sep 21 07:03 pageloader.xpi.old.20110921
lrwxrwxrwx 1 coop     build    14 May  9  2011 pageload.xpi -> pageloader.xpi
[armenzg@dm-wwwbuild01 talos]$ ls -l zips
total 234156
-rw-r--r-- 1 lsblakk build 18109405 Aug 31 13:42 flash32_10_3_183_5.zip
-rw-r--r-- 1 lsblakk build 25440443 Aug 31 13:46 flash64_11_0_d1_98.zip
lrwxrwxrwx 1 asasaki build       27 Dec 13 14:33 mozbase.zip -> mozilla-mozbase-74f5c2a.zip
-rw-r--r-- 1 asasaki users   168148 Dec 13 15:25 mozilla-mozbase-74f5c2a.zip
-rw-r--r-- 1 asasaki users   138173 Dec 13 14:29 mozilla-peptest-56ee00b.zip
drwxrwsr-x 2 armenzg build     4096 Nov 18 06:48 old
-rw-rw-r-- 1 catlee  build 46269210 May 27  2010 pagesets.zip
lrwxrwxrwx 1 asasaki build       27 Dec 13 14:30 peptest.zip -> mozilla-peptest-56ee00b.zip
-rw-rw-r-- 1 catlee  build 11581649 May 27  2010 plugins.zip
-rw-r--r-- 1 armenzg build  6007768 Nov 17 11:31 talos.bug702351.zip
-rw-r--r-- 1 lsblakk build  6010342 Nov 23 17:56 talos.bug705032.zip
lrwxrwxrwx 1 lsblakk build       19 Nov 29 10:55 talos.zip -> talos.bug705032.zip
-rw-rw-r-- 1 coop    build 46269210 May 13  2011 tp4.zip
-rw-r--r-- 1 coop    build 79455538 May 20  2011 tp5.zip
In working on https://bugzilla.mozilla.org/show_bug.cgi?id=701506 ,
create python package webserver,
I have explored several different solutions.  In short, there are
many different python packages available for doing such a thing, none
of which I found completely appropriate to our needs.

My goals for this project are:

- get the build infrastructure able to depend on python packages (vs.
  having each package either deployed to each slave or bundled a la
  http://hg.mozilla.org/build/talos/file/8197dc094fe3/create_talos_zip.py ,
  which does not scale)

- being able to declare python dependencies in the conventional way:
  you declare ``install_requires`` as part of the ``setup`` function,
  and e.g. ``python setup.py develop`` should grab the dependencies
  and versions it needs from a package index (a short sketch follows
  this list)

- avoid hitting outside networks.  Tests should not fail because
  e.g. pypi.python.org is down.  We should not depend on this or any
  other networks being fast or up at all.

- being easy to upload new versions

- being easy to maintain
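
To make the second goal concrete, a minimal (hypothetical) setup.py for
a package that would consume this index might look like the following;
the package name and the dependency list are purely illustrative:

    from setuptools import setup

    setup(name='exampletool',          # hypothetical package
          version='0.1',
          packages=['exampletool'],
          # resolved from whatever index easy_install -i / pip --find-links points at
          install_requires=['mozinfo', 'mozprocess'],
          )

With that in place, ``python setup.py develop`` (or ``easy_install -i
<index URL> exampletool``) would pull mozinfo and mozprocess from the
internal index rather than pypi.python.org.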

To this end, I wrote simpypi, as detailed in:
http://pypi.python.org/pypi/simpypi

Since it is likely that whatever is done for a first iteration will
not prove the solution we want to go with down the line, and instead
will be more of a talking point for developing a long-term solution, I
have decided to make the initial version of simpypi as simple as
possible. To this end, I made the following architecture choices:

- The simpypi GET view is just a file upload widget in a static HTML
  form. We may want to add more to this page, such as a listing of
  packages.

- simpypi currently does nothing to get packages from pypi.  Whether
  it should or not depends on our needs.

- there is no authentication.  While the simple index may be served
  within and outside of the build system without security impact, I
  have been laboring under the assumption that file upload will be
  protected by VPN. Other auth methods could also be considered.

Other issues are described in the documentation:
http://k0s.org/mozilla/hg/simpypi/file/tip/README.txt

In crafting simpypi, I've realized that the simple index and the
upload mechanisms are actually uncoupled: the former serves a package index
for installation; the latter takes an upload and puts it in an
appropriate place in a directory.  This uncoupling gives significant
flexibility with respect to deployment or development.  For instance,
the simpypi piece can be swapped out as long as the simple index
directory server continues to work.

I initially investigated (and still, to a lesser degree, continue to
investigate) https://github.com/SurveyMonkey/CheesePrism . CheesePrism is a
heavier solution (although, compared to my survey of existing python
package index solutions, it is not that heavy) that centers on taking
packages from pypi.python.org for population.  As best I can tell,
this is not really what we want from a Mozilla package server:  not
all of the packages we want or need are on pypi.python.org and the
workflow prescribed by such a solution is probably undesirable for us.
I initially hoped to add more options to CheesePrism, but bug fixes
and turnarounds have been slow.

You can see an active simpypi instance at http://k0s.org:8080 for the upload page
and http://k0s.org:8080/index/ for the index page.  I have uploaded
mozbase and a few other packages as a demonstration.  If you want to
test, deploy a new virtualenv and run e.g.

    easy_install -i http://k0s.org:8080/index/ mozrunner

Note that the packages come from k0s.org:8080 .

You can see an active CheesePrism instance at http://k0s.org:6543/

Note that this is a POC and is intended as a talking point more than a
final solution.  A basic package index can be realized using tarballs
served with a static fileserver, but we need to have a machine to do
it on.  We should also figure out our network needs:  the package
index must be usable via the build infrastructure, but can also be
publicly available.  The web UI for uploading packages, be it simpypi
or other, should be behind a VPN.  The build infrastructure will need
to change substantially to start installing dependencies from this
system vs. what we do now (which is largely working around the lack of
a package index).

We need to figure out what we want to do and drive this effort
forward.  I can work on deployment issues if we come up with a system
that I am comfortable administering and have a computer to put it
on, though I'm not necessarily ideal for the job.  The web UI for
uploading packages should be worked through -- I give the simplest
possible model, though it can no doubt be improved.  That said, the
web UI is not necessary for serving packages now, though a computer
and a static fileserver are.
So this bug is contingent on getting a machine set up to make this happen.  We'll have the simpypi front end for uploading packages behind a VPN (but otherwise open, though we should consider whether this is enough) and a fileserver, presumably apache or nginx, serving the packages and directory index both to build machines and the outside world.  A key point to take away is that these two components don't really talk to each other at all, so whatever we end up doing for a package uploader should not block the adoption of this package index.
[I will refrain from posting links to internal URLs herein for the time being as it seems silly to reference non-public information in a public bug.  We can make this a security bug or what not if we want to discuss internal practices]

In tackling this bug I've come to believe that we need to spec out what is wanted as an initial deployment.  My initial plan was to put the initial set of packages we need in pypi format and serve them as static files. Since we only need a small subset of packages, assuming some A*Team and/or releng maintainer could modify the package index, this should be sufficient to get us off the ground. To this end, I have put the packages necessary to run Talos on a development "server". However, since the automation and tools team wants this to be managed by IT, we should identify the IT contact here and figure out:
- what work is to be done by IT?
- what access ateam/releng contacts will have initially to this box?

If we aren't going to be able to e.g. scp files to the directory from which they are served, we'll need to rethink this a bit.

Serving the files themselves is just serving static files; the question is how we are going to maintain the index: that is, add packages, update packages, and remove packages.

This blocks mozharness+talos in production, jetperf in production, and a lot of work we want to do for testing in production in general.
> 
> Serving the files themselves is just serving static files; the question is
> how are we going to maintain the index: that is add packages, update
> packages, remove packages?
> 
> This blocks mozharness+talos in production, jetperf, in production, and a
> lot of work we want to do for testing in production in general.

My first thought is to put the packages and the index in a separate repo maintained just for this purpose, and then have some process on the server which pulls the repo periodically.  Or, if IT doesn't like automatically pulling from a repo, we could file IT bugs to update the repo to a certain commit.

This would also allow us to easily share the static content in the development, staging, and production servers.

This would allow us to update the package index without needing direct access to the production environment.
I have mixed emotions on using a repo for this.  The packages will be binary data, so will this be too painful to deal with?  I imagine we're going to start with 10s of megs here and that will probably quickly become 100s of megs.  Since we're storing all history.... :/

On the other hand, this would solve the problem now, assuming IT is okay with it.

In any case, we should probably get whoever is going to maintain this for IT to get their buy in.
FWIW I've used git to store Python sdist tarballs on several projects (including MozTrap) over a period of a year or more, and haven't had any practical troubles yet with repository size. Code just doesn't tend to be that big. I haven't dealt with any individual packages larger than ~6MB; dunno what size of packages will be stored on this server.

If the size of the repo history grows over time to a point where it's no longer manageable, there are several techniques available to rebase unneeded history away and regain the space.

Built-in robust change-tracking and rollback is a pretty nice advantage of the git-repo approach compared to ad-hoc file-upload web UIs.
So it sounds like a few things:

1. Getting things in some repository is a good solution for both IT (the would-be maintainers) and the ateam, releng, and other stakeholders.  This solves the auth problem (since you will need auth to push), there doesn't need to be any fancy web interface, just a static fileserver (and directory index server, let's be explicit), and it should be fully scriptable for automation on both sides.
1.a. What type of vcs we need is really up to IT.  Since these are binary files, that is some consideration.  That said, we will mostly (ideally, exclusively, but that is an ideal) be adding new package versions, so we will (again, for the most part) not be overwriting files.

2. The pypi server can then just pull this repository periodically and serve up the checkout via apache/nginx/whatever static fileserver.  This can be done via a cron job or, better, a post-push hook on the repo.
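
For (2), a minimal sketch of the update step, runnable from cron or wired up as an hg changegroup (post-push) hook; the checkout path here is hypothetical:

    # keep the served checkout in sync with the canonical package repository
    import subprocess

    CHECKOUT = '/var/www/html/python/packages'  # hypothetical serving directory

    def update_checkout():
        # pull new packages and update the working copy that apache/nginx serves
        subprocess.check_call(['hg', 'pull', '-u'], cwd=CHECKOUT)

    if __name__ == '__main__':
        update_checkout()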

I'll file the bugs to get this in motion.
Dustin, Kim:
Callek mentioned that you were working on a puppet storage solution.
Does that overlap at all here with the pypi solution?
If so, we might want to investigate a shared solution; if not, let's not.
Yes, it's already complete, too :)

Bear opened another bug to talk about this, so we'll get it sorted out.
bug 755424, that is
So I'm not sure if this and bug 755424 are related or not.  I'll hold off on filing the IT bugs until this is resolved:

- we need a repository to put packages in
- we need a box that is visible to b.m.o and the outside world that serves the files (and directory indices) in this repo with a static fileserver
-- the checkout on this box should be updated with a post-push hook or a cronjob
We had some discussion of this in IRC.

The core requirement is to have a highly available resource within the releng network from which to install Python packages.  These installs would happen dynamically at build time.  Related to that is the need for external-to-releng access to the resource for debugging package-dependency issues (that is, exactly replicating the python environment within the releng network).  Users should also be able to install everything on their laptop easily.

Good news everyone!  We already have one:
  https://wiki.mozilla.org/ReleaseEngineering/PuppetAgain/Python

We call pip from puppet with
  --no-deps --no-index --find-links http://repos/python/packages
which has the effect of scanning the Apache-generated index and finding the right .tar.gz file.

Files are added here by releng or relops, following the usual patch process -- a bug, basically.  The files are automatically replicated to all puppetmasters, and a change-control email goes out when this occurs.

Problems:

This is *not* in pypi format.  The existing puppet stuff uses an explicit list of packages and their versions, rather than following the usual package dependency chain, so pypi format may be required.  If so, let's find the simplest way to accomplish it.  I'd rather not install a webapp for this purpose, but running a shell script to pypi-ify a directory is fine by me (similar to running createrepo for a yum repository).
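
As a sketch of what such a "pypi-ify" pass could look like (written in Python here rather than shell; the directory layout, paths, and the naive filename parsing are all assumptions):

    import os

    def pypiify(flat_dir, index_dir):
        # group sdists (assumed to be named name-version.tar.gz) by package name
        packages = {}
        for filename in sorted(os.listdir(flat_dir)):
            if not filename.endswith('.tar.gz'):
                continue
            name = filename.rsplit('-', 1)[0].lower()  # naive name parsing
            packages.setdefault(name, []).append(filename)
        # write one simple-index style page per package, linking back to the flat dir
        for name, files in packages.items():
            pkgdir = os.path.join(index_dir, name)
            if not os.path.isdir(pkgdir):
                os.makedirs(pkgdir)
            links = ['<a href="../../packages/%s">%s</a><br/>' % (f, f)
                     for f in files]
            with open(os.path.join(pkgdir, 'index.html'), 'w') as html:
                html.write('<html><body>\n%s\n</body></html>\n' % '\n'.join(links))

    if __name__ == '__main__':
        # assumes packages/ and index/ live side by side under the same webroot
        pypiify('/var/www/html/python/packages', '/var/www/html/python/index')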

This is also not currently HA: if the local puppet master is down, http://repos will not work, and pip will be sad.  We don't have the resources to put an HA layer like Zeus between the clients and the server, so the HA behavior needs to be implemented on the client -- given a list of URL prefixes (e.g., in /etc/python-package-servers.txt), the client should try them all, in order, until it finds one that works.
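
A minimal sketch of that client-side fallback, assuming the file holds one URL prefix per line and that the pip arguments mirror the ones quoted above:

    import subprocess

    def install(requirement, server_list='/etc/python-package-servers.txt'):
        # try each package server prefix in order until a pip run succeeds
        prefixes = [line.strip() for line in open(server_list) if line.strip()]
        for prefix in prefixes:
            status = subprocess.call(['pip', 'install', '--no-deps', '--no-index',
                                      '--find-links', prefix, requirement])
            if status == 0:
                return prefix
        raise RuntimeError('no package server could satisfy %s' % requirement)

    # e.g. install('mozrunner')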

I think we should tackle the problems in that order, as the first may block even a development deploy, while the second only blocks production.  I will not have time to do the research here, although I'm happy to help with discussion and testing, and will do the implementation.  Jeff, Carl, perhaps you can figure these things out?
heads-up to puppetagain hackers ^^
"Not in PyPI format" is not, AFAICS, an actual problem. Unless/until the repo is serving ridiculous numbers of packages (quad digits?), there's nothing gained by splitting the index up into directories PyPI-style, and it makes adding packages to the repo more complex. Using "--no-index --find-links URL" where URL is a flat listing of tarballs works just as well as "--index URL" where URL is a full PyPI directory structure. In the end, all pip is doing in either case is scraping HTML pages for links to tarballs, it doesn't matter at all whether those links come directly from --find-links or indirectly from appending package names to an index URL.

The dependency issue is orthogonal. Obviously if you use "--no-deps" it won't pull in dependencies, but if you leave out "--no-deps" and use "--no-index --find-links", pip will happily pull in dependencies (as long as it can find them at the --find-links target URL, of course, which means the person adding the packages needs to add its dependencies, too). (Although for reproducibility reasons I'd still generally recommend installing from a fully-version-pinned, flattened, explicit requirements file that includes all dependencies, as it sounds like "the existing puppet stuff" already does).
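
For example, a flattened, fully pinned requirements file (every name and version number below is purely illustrative) might look like:

    # requirements.txt -- all dependencies listed explicitly and pinned
    mozinfo==0.3.3
    mozprocess==0.4
    mozprofile==0.1
    mozrunner==5.1

installed with something like ``pip install --no-index --find-links <URL> -r requirements.txt``.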

TL;DR I don't think the first problem is one. I can't speak as well to the requirements in terms of HA and availability to the outside world.
Awesome, thanks Carl!
Running 

pip install -v --no-index --find-links http://people.mozilla.com/~jhammel/findlinks/ talos 

works successfully
Blocks: 756129
> This is also not currently HA: if the local puppet master is down, http://repos 
> will not work, and pip will be sad.  We don't have the resources to put an HA
> layer like Zeus between the clients and the server, so the HA behavior needs to 
> be implemented on the client -- given a list of URL prefixes (e.g., in 
> /etc/python-package-servers.txt), the client should try them all, in order, until 
> it finds one that works.

Can you clarify this?  If the puppet master is down, will builds still be running?  And if so, where could they get the packages then if http://repos is unavailable?

> Related to that is the need for external-to-releng access to the resource for 
> debugging package-dependency issues..

Could the files in this location be mirrored somewhere public for debugging purposes?
(In reply to Jonathan Griffin (:jgriffin) from comment #35)
> > This is also not currently HA: if the local puppet master is down, http://repos 
> > will not work, and pip will be sad.  We don't have the resources to put an HA > 
> > layer like Zeus between the clients and the server, so the HA behavior needs to 
> > be implemented on the client -- given a list of URL prefixes (e.g., in 
> > /etc/python-package-servers.txt), the client should try them all, in order, until 
> > it finds one that works.
> 
> Can you clarify this?  If the puppet master is down, will builds still be
> running?  And if so, where could they get the packages then if http://repos
> is unavailable?

If the local puppet master is down, builds need to continue running.  Pip runs in the build would get the packages from some URL other than http://repos, based on the list of URL prefixes.  An example might look like
  http://releng-puppet1.srv.releng.scl3.mozilla.com/python/packages
So the functionality that we need on the client is to be able to take a URL list like this.

> > Related to that is the need for external-to-releng access to the resource for 
> > debugging package-dependency issues..
> 
> Could the files in this location be mirrored somewhere public for debugging
> purposes?

That's exactly the purpose, c.f. "for debugging".
Depends on: 757283
Depends on: 757285
Depends on: 759488
Since my bug was dup'd to this one, let me outline the two things that may not be obvious: 

history - we need a way to maintain older versions (whether by version number or timestamp) so that they can be accessed even if the new hotness has arrived

logging - we need to know who/what/when things are updated/replaced.  so we know who to go beat on when the new hotness has become the OMGWTFBBQ broked'ness
(In reply to Mike Taylor [:bear] from comment #38)
> Since my bug was dup'd to this one, let me outline the two things that may
> not be obvious: 
> 
> history - we need a way to maintain older versions (whether by version
> number or timestamp) so that they can be accessed even if the new hotness
> has arrived

Multiple packages with different versions are fine.

> logging - we need to know who/what/when things are updated/replaced.  so we
> know who to go beat on when the new hotness has become the OMGWTFBBQ
> broked'ness

Change-control emails go out whenever files are updated.
(In reply to Dustin J. Mitchell [:dustin] from comment #39)
> (In reply to Mike Taylor [:bear] from comment #38)
> > Since my bug was dup'd to this one, let me outline the two things that may
> > not be obvious: 
> > 
> > history - we need a way to maintain older versions (whether by version
> > number or timestamp) so that they can be accessed even if the new hotness
> > has arrived
> 
> Multiple packages with different versions are fine.
> 
> > logging - we need to know who/what/when things are updated/replaced.  so we
> > know who to go beat on when the new hotness has become the OMGWTFBBQ
> > broked'ness
> 
> Change-control emails go out whenever files are updated.

my preference would be to make the change control part of the metadata, but it's not a blocker at all and I'll squint at this part as it gets deployed :)
We could certainly use something like git-annex for that purpose, but I don't think it would buy us much, and there are much bigger fires burning.

In general, I think we should expect file storage to be

- mostly uniquely-named, mirrored content (from pypi, yum repos, etc.)
  - files are rarely overwritten
  - unused files need not be deleted
  - if a needed file is missing, you'll get a 404 or equivalent
- where not mirrored, easily reproduced
  - via spec files or instructions in the wiki

If we find ourselves putting content here that we feel might cause brokenness, then it's probably the wrong place for it.  Anything that can cause brokenness should be a change in the puppet manifests, so it can easily be backed out.  That means pinned package version numbers, for example.
Blocks: 650880
Does anyone want to own this bug?  It sounds like it is basically done, though maybe follow-ups need to be created (e.g. is anything needed for mozharness?)
I've been treating it as a tracker for the work you've asked of me, so if there's nothing further required, you can assign it to me and I'll close it when the deps are done.
Thanks Dustin.
Assignee: nobody → dustin
Any ETA on this? This blocks bug#713055, which is a Q2 goal.
The block is on bug 759488, which is a releng bug, so no ETA from me.
(In reply to Dustin J. Mitchell [:dustin] from comment #46)
> The block is on bug 759488, which is a releng bug, so no ETA from me.

ok, we'll find an owner to investigate bug#759488. Meanwhile, what other work is needed here or in bug#757285 before we can have this webserver in production?
(In reply to Dustin J. Mitchell [:dustin] from comment #46)
> The block is on bug 759488, which is a releng bug, so no ETA from me.

How does pointing clients to it block setting up the server?
Nothing.
(In reply to Aki Sasaki [:aki] from comment #48)
> How does pointing clients to it block setting up the server?

It doesn't - the server's already set up, and has been for a few months.
Adding [tracker] per comment 43 and to reduce confusion.
Summary: create python package webserver → [tracker] create python package webserver
So the public mirror is almost ready (just pending a flow).  Where should I write up docs on how to use it?
OK, this is done as far as I'm concerned, and as far as the dependencies are concerned.  I'm happy to add additional docs, but I'm not sure where they should go.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
(In reply to Dustin J. Mitchell [:dustin] from comment #53)
> OK, this is done as far as I'm concerned, and as far as the dependencies are
> concerned.  I'm happy to add additional docs, but I'm not sure where they
> should go.

Perhaps a page on wiki.mozilla.org, or intranet.mozilla.org, if the docs will include any sensitive info.
Maybe something like https://wiki.mozilla.org/Buildbot/PythonPackages would be appropriate? I assume there won't be anything private, at least on the "how to use this thing" side.  Speaking of...how do you use this thing? http://puppetagain.pub.build.mozilla.org/ gives the generic apache start page. Where do I go to actually see packages?
I added it here
  https://wiki.mozilla.org/ReleaseEngineering/PuppetAgain/Python
this really has nothing to do with Buildbot, so I don't think Buildbot/PythonPackages is a good place for it.

Please feel free to point there from other documentation.
Excellent! Thanks, Dustin!
Product: mozilla.org → Release Engineering