805016 - integrating Buildbot/Foopies with BMM/MozPool/Lifeguard for Android

Reporter

Description

12 years ago

From an email sent a few days ago. This is where we're farthest behind in terms of panda-based Android and B2G support, so we should figure this out ASAP and get moving on the necessary coding and configuration. I'm not sure who in releng is the point person for this. Whoever is, please take the bug?

----

We're still working on getting production hardware in place for the pandas, but this is a good time to work on connecting the systems together. As I understand it, we'll need to put Android into prod immediately when the hardware is available; B2G is not far behind, and foopyless configurations are important to consider but out of scope for implementation at the moment. Let me know if that prioritization is incorrect.

BMM/MozPool/Lifeguard are still sorting out what does what, but essentially we'll have an HTTP endpoint to request that a board be power-cycled, and a similar endpoint to request re-imaging, where the request specifies the desired image. For Android, AIUI power-cycling is part of the production process, clearing the device between runs, while re-imaging is used for automatic failure remediation. For B2G, reimaging would occur on just about every boot. There are provisions in place to pass a JSON "config blob" with the reimage request, which would indicate precisely which B2G image should be downloaded and installed. We don't yet have the live-image scripts required to install B2G.

As Mark and I work out how BMM, MozPool, and Lifeguard work together, it'd be helpful to have releng's integration vision. So, Callek, Aki, and/or Kim (or who?), in broad strokes, how do you see this process working?

Random points to jog your thoughts:

* We could add BMM servers for tegras, too, to make a single reboot API that would work for both (automatically reimaging tegras is not currently possible)

* Will foopies "check out" a particular board? What happens if that board is not functional? When does the board get rebooted? In a B2G context, when does it get reimaged? What happens if the reboot or reimage fails?

* BMM servers are co-located with the hardware they manage; there's no central server. If you hit one BMM server with a request for a board it doesn't manage, it will redirect (302) to the correct BMM server. BMM servers for each board are also listed in inventory. So, if foopies hit BMM servers directly, it will need to be a bit more complex than a single curl or python-requests call, but not too bad.

So, let me know what you're thinking and planning, and we'll bring this together. I don't think we face any particularly challenging coding issues, once we agree on a design. And we have a surfeit of coders, so we should be in good shape.
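
To illustrate the redirect point above, here is a minimal sketch of how a foopy-side client might send a reimage request and follow a 302 to the BMM server that actually manages the board. Everything specific in it is an assumption for illustration only: the URL path, the host names, and the JSON keys `image` and `config` are hypothetical, since the BMM API had not been defined at the time of this bug. The redirect is followed by hand because some HTTP clients re-send a redirected POST as a GET, which would drop the request body.

```python
import json
from urllib.parse import urljoin

MAX_REDIRECTS = 3  # assumption: one hop should normally be enough


def build_reimage_request(bmm_host, board, image, config):
    """Return (url, body) for a hypothetical BMM re-image endpoint."""
    # Hypothetical path; the real BMM API is not specified in this bug.
    url = f"http://{bmm_host}/api/board/{board}/reimage/"
    # The "config blob" travels as JSON alongside the desired image name.
    body = json.dumps({"image": image, "config": config})
    return url, body


def post_following_redirects(send, url, body, max_redirects=MAX_REDIRECTS):
    """POST body to url, re-sending it to the Location target on each 302.

    `send(url, body)` is any callable that performs the POST without
    following redirects and returns (status_code, headers_dict).
    Returns (final_url, final_status).
    """
    for _ in range(max_redirects + 1):
        status, headers = send(url, body)
        if status != 302:
            return url, status
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, headers["Location"])
    raise RuntimeError("too many redirects while locating the BMM server")
```

Passing the transport in as `send` keeps the redirect logic independent of the HTTP library, so the same loop works whether the foopy uses python-requests (with automatic redirects disabled) or something else.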

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 1

12 years ago

(the initial email was only to a few people, and probably I guessed the wrong people - don't worry if you're missing it, as it's copied in full above)

Melissa O'Connor [:melissa]

Updated

12 years ago

Summary: integrating Buildbot/Foopies with BMM/MozPool/Lifeguard → integrating Buildbot/Foopies with BMM/MozPool/Lifeguard for Android

Armen [:armenzg]

Comment 2

12 years ago

I think that once clientproxy.py has run verify.py we can determine whether a board needs to be re-imaged and talk to BMM.

> * We could add BMM servers for tegras, too, to make a single reboot API that would
> work for both (automatically reimaging tegras is not currently possible)
>

I would suggest not getting entangled with tegras until we iron everything out with pandas for Android/B2G, but it seems like a great idea for keeping the pool at maximum.

> * Will foopies "check out" a particular board? What happens if that board is not
> functional? When does the board get rebooted? In a B2G context, when does it get
> reimaged? What happens if the reboot or reimage fails?
>

clientproxy.py does not currently check out boards. Can we assume a board is not functional if it has been re-imaged once and still cannot pass verify.py? I will defer the other questions to kmoir and Callek.

> * BMM servers are co-located with the hardware they manage; there's no central
> server. If you hit one BMM server with a request for a board it doesn't manage,
> it will redirect (302) to the correct BMM server. BMM servers for each board are
> also listed in inventory. So, if foopies hit BMM servers directly, it will need
> to be a bit more complex than a single curl or python-requests call, but not too
> bad.
>

This sounds great! I don't see a question in this last point; it reads more as an FYI.
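
The rule proposed above (verify, re-image once on failure, then treat the board as non-functional if it still fails) could be sketched roughly as follows. The function and parameter names are hypothetical; `verify` and `reimage` stand in for whatever verify.py and the BMM reimage request end up providing, and the limit of one re-image is simply the assumption floated in this comment.

```python
def remediate(board, verify, reimage, max_reimages=1):
    """Return 'ok', 'recovered', or 'failed' for a board.

    verify(board)  -> bool, True when the board passes verification
    reimage(board) -> None, requests a re-image and waits for it to finish
    """
    if verify(board):
        return "ok"
    for _ in range(max_reimages):
        reimage(board)
        if verify(board):
            return "recovered"
    # Still failing after the allowed re-images: treat as non-functional.
    return "failed"
```

A caller would take a board out of the pool on `'failed'` and leave it in service otherwise.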

Armen [:armenzg]

Comment 3

12 years ago

This is for reference:

https://wiki.mozilla.org/ReleaseEngineering/BlackMobileMagic (bmm)
https://wiki.mozilla.org/Auto-tools/Projects/Lifeguard
https://wiki.mozilla.org/Auto-tools/Projects/MozPool

Hal Wine [:hwine] (use NI)

Comment 4

12 years ago

We may have a fairly large disconnect on these projects, based on prior discussions with folks not on the bug. I'm going to take that discussion to email to sort out, and will post the result here.

fwiw, the basic understanding from the releng point of view is:

- pandas-for-android need none of this. ateam may use some of the back end to implement bug 797868 (s/a bug 797868)

- pandas-for-b2g will need buildbot control of reimaging (bmm?). The API releng would use has yet to be designed - discussions are happening in email with ateam about that API.

ATM, releng isn't aware of any direct interaction between our systems and either the lifeguard or mozpool efforts.

If there's a disconnect, we'll find it in the email, and that will be a good thing!

Amy Rich [:arr] [:arich]

Comment 5

12 years ago

Please be sure to include dustin and dividehex on this mail thread.

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 6

12 years ago

https://etherpad.mozilla.org/panda-b2g-imaging

From Hal's action items at the bottom:

* dustin to coordinate with kmoir as existing panda chassis connected to BMM (primarily for dcops usage atm).

* relops/ateam to focus on android issues first (as we're expecting some android image changes over the next bit of time)

* b2g images coming w/in a week (missed date from Clint)

Status: NEW → RESOLVED

Closed: 12 years ago

Resolution: --- → FIXED

Nobody; OK to take it and work on it

Assignee

Updated

11 years ago

Product: mozilla.org → Release Engineering

Categories

(Release Engineering :: General, defect)

Tracking

(Not tracked)

People

(Reporter: dustin, Unassigned)

Security

(public)
