Chain of Trust: newer projects.yml isn't automatically fetched
Categories: Release Engineering :: Release Automation: Other, defect
Tracking: Not tracked
People: Reporter: jlorenzo, Unassigned
In bug 1555777, mozilla-esr68 was added to projects.yml [1]. Sadly, scriptworker didn't pick this change up [2]:
2019-06-07T04:18:57 ERROR - Error while rebuilding signing:parent RLN9g7W7QzmldnmYFFycPg task definition!
Traceback (most recent call last):
  File "/builds/scriptworker/lib/python3.6/site-packages/scriptworker/cot/verify.py", line 1535, in verify_parent_task_definition
    chain, parent_link, decision_link, tasks_for
  File "/builds/scriptworker/lib/python3.6/site-packages/scriptworker/cot/verify.py", line 1499, in get_jsone_context_and_template
    chain, parent_link, decision_link, tasks_for
  File "/builds/scriptworker/lib/python3.6/site-packages/scriptworker/cot/verify.py", line 1292, in populate_jsone_context
    jsone_context['repository']['level'] = await get_scm_level(chain.context, project)
  File "/builds/scriptworker/lib/python3.6/site-packages/scriptworker/cot/verify.py", line 989, in get_scm_level
    level = context.projects[project]['access'].replace("scm_level_", "")
KeyError: 'mozilla-esr68'
This happens because Chain of Trust caches that file unless you pass the force argument [3]. force is only passed in unit tests, so a running scriptworker never re-downloads the file. Therefore, we need to restart all scriptworker processes every time we expand this file.
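The caching behaviour described above can be sketched roughly as follows. This is a minimal illustration, not scriptworker's actual code: ProjectsCache, its fetch callable, demo(), and the sample project data are all hypothetical names and values chosen for the example.

```python
import asyncio


class ProjectsCache:
    """Hypothetical stand-in for the object that caches projects.yml."""

    def __init__(self, fetch):
        self._fetch = fetch    # async callable returning the parsed mapping
        self.projects = None   # nothing cached until the first load

    async def populate_projects(self, force=False):
        # Download only when the cache is empty or the caller forces it.
        if force or self.projects is None:
            self.projects = await self._fetch()
        return self.projects


async def demo():
    # Illustrative data only; the access levels are made up for the example.
    upstream = {"mozilla-esr60": {"access": "scm_level_3"}}

    async def fetch():
        return dict(upstream)  # copy, so later upstream edits stay invisible

    cache = ProjectsCache(fetch)
    await cache.populate_projects()

    # A new repository lands upstream after the first load.
    upstream["mozilla-esr68"] = {"access": "scm_level_3"}

    stale = await cache.populate_projects()            # served from the cache
    fresh = await cache.populate_projects(force=True)  # re-fetched
    return "mozilla-esr68" in stale, "mozilla-esr68" in fresh
```

Without force, a long-running process keeps serving the mapping it loaded at startup, which is how a lookup for a newly added project can end in a KeyError.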
[1] https://hg.mozilla.org/ci/ci-configuration/file/cf1265c33b9aed70d9e0f826d6fbec9119a89596/projects.yml#l69
[2] https://treeherder.mozilla.org/#/jobs?repo=mozilla-esr68&group_state=expanded&revision=b623b7cc2ae896249b3067fa66b40d04fcd16c6b&selectedJob=250540677
[3] https://github.com/mozilla-releng/scriptworker/blob/0b21b0d4e616de4d04a8eb85e7cbf061c2e9c57a/scriptworker/context.py#L218
Comment 1 (Reporter) • 5 years ago
I see the next build signing jobs passed. I assume scriptworker processes were restarted.
I've documented this issue in this bug, at least for posterity. Aki, do you think we should address this caching issue, or are we fine living with it as long as it's known/documented? I'm fine with both.
Comment 2 • 5 years ago
Hm.
For puppet, we could potentially have puppet restart scriptworker if the file is updated: this would be some sort of script that updates a ci-configuration clone and runs a checksum on projects.yml.
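The checksum half of such a script might look something like the sketch below. It assumes a clone has already been updated by some other step; file_digest, needs_restart, and the state file that remembers the last digest are hypothetical, and the actual restart of scriptworker is left out.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of the file at `path`."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def needs_restart(projects_file, state_file):
    """Return True when projects_file differs from the last recorded digest.

    The current digest is written to state_file so that the next run
    compares against it. The first run only records a baseline.
    """
    current = file_digest(projects_file)
    state = Path(state_file)
    previous = state.read_text().strip() if state.exists() else None
    state.write_text(current)
    return previous is not None and previous != current
```

Treating the first run as "no restart" is a design choice in this sketch: with no earlier digest to compare against, there is nothing to show that the file changed.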
For the daemon, we could have a TTL for self.projects (e.g., re-download and update the cache every 24hrs).
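A TTL on the cached projects could be sketched along these lines. TTLProjects, its fetch callable, and the injectable clock are assumptions made for illustration; this is not how scriptworker's context is implemented.

```python
import asyncio
import time


class TTLProjects:
    """Hypothetical cache that re-fetches projects once the TTL has elapsed."""

    def __init__(self, fetch, ttl_seconds=24 * 60 * 60, clock=time.monotonic):
        self._fetch = fetch        # async callable returning the parsed mapping
        self._ttl = ttl_seconds    # defaults to 24 hours
        self._clock = clock        # injectable so tests can fake time
        self._loaded_at = None     # clock reading at the last successful load
        self.projects = None

    def _expired(self):
        # Expired when never loaded, or when the TTL has fully elapsed.
        return self._loaded_at is None or self._clock() - self._loaded_at >= self._ttl

    async def get(self, force=False):
        # Re-download when forced or when the cached copy is too old.
        if force or self._expired():
            self.projects = await self._fetch()
            self._loaded_at = self._clock()
        return self.projects
```

With a 24-hour TTL, a newly added project would be picked up within a day without a restart, at the cost of one extra download per day per process.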
I’m also ok with documenting this issue if it only really happens once every esr cycle. I imagine it’s more frequent with mobile; we’ll need to determine if we add repos regularly that don’t get cot constants.py changes as well. If we need to land a cot constants.py change, that will restart scriptworker, so we just need to make sure we land the projects.yml change first.
Comment 3 • 5 years ago
This instance did require a constants change, which led to the scriptworkers restarting.
Comment 4 (Reporter) • 4 years ago
Good news: we don't use puppet anymore for scriptworker. If I'm not mistaken, we scale our instances down to 0 every time there's no task to do. Thus, scriptworker restarts much more often, which greatly reduces the impact of this bug. Closing it.