Closed Bug 1558152 Opened 5 years ago Closed 4 years ago

Chain of Trust: newer projects.yml isn't automatically fetched

Categories: Release Engineering :: Release Automation: Other (defect)
Priority: Not set
Severity: normal
Tracking: Not tracked
Status: RESOLVED WONTFIX
People: Reporter: jlorenzo; Assignee: Unassigned

In bug 1555777, mozilla-esr68 was added to projects.yml[1]. Sadly, scriptworker didn't pick this change up[2]:

2019-06-07T04:18:57    ERROR - Error while rebuilding signing:parent RLN9g7W7QzmldnmYFFycPg task definition!
Traceback (most recent call last):
  File "/builds/scriptworker/lib/python3.6/site-packages/scriptworker/cot/verify.py", line 1535, in verify_parent_task_definition
    chain, parent_link, decision_link, tasks_for
  File "/builds/scriptworker/lib/python3.6/site-packages/scriptworker/cot/verify.py", line 1499, in get_jsone_context_and_template
    chain, parent_link, decision_link, tasks_for
  File "/builds/scriptworker/lib/python3.6/site-packages/scriptworker/cot/verify.py", line 1292, in populate_jsone_context
    jsone_context['repository']['level'] = await get_scm_level(chain.context, project)
  File "/builds/scriptworker/lib/python3.6/site-packages/scriptworker/cot/verify.py", line 989, in get_scm_level
    level = context.projects[project]['access'].replace("scm_level_", "")
KeyError: 'mozilla-esr68'

This is because Chain of Trust caches that file unless you pass the force argument[3], and force is only passed in unit tests. Therefore, we currently need to restart all scriptworker processes every time a project is added to this file.

[1] https://hg.mozilla.org/ci/ci-configuration/file/cf1265c33b9aed70d9e0f826d6fbec9119a89596/projects.yml#l69
[2] https://treeherder.mozilla.org/#/jobs?repo=mozilla-esr68&group_state=expanded&revision=b623b7cc2ae896249b3067fa66b40d04fcd16c6b&selectedJob=250540677
[3] https://github.com/mozilla-releng/scriptworker/blob/0b21b0d4e616de4d04a8eb85e7cbf061c2e9c57a/scriptworker/context.py#L218
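
For illustration, here's a minimal sketch of the caching pattern in question, loosely modeled on context.py[3]. The class, method, and helper names below are placeholders rather than the exact scriptworker API:

import asyncio

async def download_projects_yml():
    # Stand-in for the real network fetch of projects.yml from
    # ci-configuration; hypothetical helper, not scriptworker code.
    return {"mozilla-central": {"access": "scm_level_3"}}

class Context:
    def __init__(self):
        self.projects = None  # cached contents of projects.yml

    async def populate_projects(self, force=False):
        # Only re-download when there is no cache yet or the caller
        # forces a refresh. Since force=True is only passed in unit
        # tests, a long-running worker keeps serving the copy it fetched
        # at startup, hence the KeyError above for a repo added later.
        if force or not self.projects:
            self.projects = await download_projects_yml()

async def main():
    ctx = Context()
    await ctx.populate_projects()   # first call fetches
    await ctx.populate_projects()   # later calls reuse the cache
    print(ctx.projects["mozilla-central"]["access"])

asyncio.run(main())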

I see the next build signing jobs passed. I assume scriptworker processes were restarted.

I've documented this issue in this bug, at least for posterity. Aki, do you think we should address this caching issue, or are we fine living with it as long as it's known/documented? I'm fine with either.

Hm.

For puppet, we could potentially have puppet restart scriptworker whenever the file is updated: this would be some sort of script that updates a ci-configuration clone and runs a checksum on projects.yml.
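
To make that concrete, here's a rough sketch of such a script. The paths, the clone location, and the service name are assumptions for illustration, not the actual deployment layout:

#!/usr/bin/env python3
"""Keep a ci-configuration clone up to date, checksum projects.yml, and
restart scriptworker when the checksum changes. Illustrative only."""

import hashlib
import pathlib
import subprocess

CLONE = pathlib.Path("/builds/ci-configuration")   # assumed clone path
PROJECTS = CLONE / "projects.yml"
STAMP = pathlib.Path("/builds/scriptworker/.projects_yml.sha256")

def main():
    # Update the local clone of ci-configuration.
    subprocess.run(["hg", "pull", "-u", "-R", str(CLONE)], check=True)

    # Hash the current projects.yml and compare it to the last-seen hash.
    new_digest = hashlib.sha256(PROJECTS.read_bytes()).hexdigest()
    old_digest = STAMP.read_text().strip() if STAMP.exists() else ""

    if new_digest != old_digest:
        # projects.yml changed: restart the worker so CoT re-fetches it.
        subprocess.run(["systemctl", "restart", "scriptworker"], check=True)
        STAMP.write_text(new_digest + "\n")

if __name__ == "__main__":
    main()

Puppet (or cron) would run this periodically; the stored checksum keeps restarts limited to actual projects.yml changes.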

For the daemon, we could have a TTL for self.projects (e.g., re-download and update the cache every 24hrs).
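
Sketched out, the TTL approach could look something like this (again, the names are illustrative, not the real scriptworker implementation):

import asyncio
import time

PROJECTS_TTL = 24 * 60 * 60  # seconds; the 24h figure suggested above

async def download_projects_yml():
    # Stand-in for the real fetch; hypothetical helper.
    return {"mozilla-esr68": {"access": "scm_level_3"}}

class Context:
    def __init__(self):
        self.projects = None
        self._projects_fetched_at = 0.0

    async def populate_projects(self, force=False):
        # Re-download once the cached copy is older than PROJECTS_TTL, so
        # a newly added repo shows up within a day without a restart.
        expired = (time.time() - self._projects_fetched_at) > PROJECTS_TTL
        if force or not self.projects or expired:
            self.projects = await download_projects_yml()
            self._projects_fetched_at = time.time()

asyncio.run(Context().populate_projects())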

I’m also ok with documenting this issue if it only really happens once every ESR cycle. I imagine it’s more frequent with mobile; we’ll need to determine whether we regularly add repos that don’t also require CoT constants.py changes. If we do need to land a CoT constants.py change, that will restart scriptworker, so we just need to make sure we land the projects.yml change first.

This instance did require a constants change, which led to the scriptworkers restarting.

Good news: we don't use puppet for scriptworker anymore. If I'm not mistaken, we scale our instances down to 0 whenever there are no tasks to run, so scriptworker gets restarted much more often, which dwarfs the impact of this bug. Closing it.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WONTFIX