Closed Bug 1396168 Opened 7 years ago Closed 5 years ago

Firewall exceptions needed for various hardware tests

Categories

(Infrastructure & Operations :: RelOps: General, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 1535467

People

(Reporter: markco, Assigned: markco)

Attachments

(2 files)

During clipboard and WebGL tests, a prompt pops up requesting firewall access for:
C:\users\$taskuser\build\tests\bin\ssltunnel.exe
C:\users\$taskuser\build\venv\scripts\python.exe
This may be happening on other tests as well. What makes firewall exceptions tricky here is that the task user is created dynamically, which means the directory changes with each test. I am working around it for now for testing purposes only, but this will need to be addressed before the Moonshots go into production.
grenade: Do you have any suggestions on how to handle this?
Flags: needinfo?(rthijssen)
not sure what will work, but i would try one or some of:
- a program firewall exception with only the program name specified, without the path
- figure out what ports those two programs are trying to speak on and create port based exceptions
- something similar to the symlink ted implemented for builds (https://hg.mozilla.org/mozilla-central/rev/f4c1cb96ec82)
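To make the first two options concrete, here is a minimal sketch that only builds `netsh advfirewall firewall add rule` argument lists. It is not taken from the patch in this bug. The rule names, the example path, and port 4443 are hypothetical placeholders, since the thread never states which ports the two programs use.

```python
def program_rule_args(name, program, direction="in"):
    """Arguments for a rule that allows a single executable."""
    return [
        "netsh", "advfirewall", "firewall", "add", "rule",
        f"name={name}", f"dir={direction}", "action=allow",
        f"program={program}", "enable=yes",
    ]


def port_rule_args(name, port, direction="in", protocol="TCP"):
    """Arguments for a rule that allows one local port, independent of any path."""
    return [
        "netsh", "advfirewall", "firewall", "add", "rule",
        f"name={name}", f"dir={direction}", "action=allow",
        f"protocol={protocol}", f"localport={port}",
    ]


if __name__ == "__main__":
    # Hypothetical values, printed rather than executed.
    print(" ".join(program_rule_args(
        "ssltunnel-in", r"C:\users\task_example\build\tests\bin\ssltunnel.exe")))
    print(" ".join(port_rule_args("ssltunnel-port-in", 4443)))
```

A port based rule does not depend on the task user directory, which is what makes it attractive here, but it only helps if the ports are fixed and known.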
Flags: needinfo?(rthijssen)
this patch adds firewall exceptions for task user programs to the task init scripts
Assignee: relops → rthijssen
Status: NEW → ASSIGNED
Attachment #8906939 - Flags: review?(mcornmesser)
Comment on attachment 8906939 [details] https://github.com/mozilla-releng/OpenCloudConfig/pull/91 Looks good. We will also need the same thing done for Windows 10.
Attachment #8906939 - Flags: review?(mcornmesser) → review+
merged: https://github.com/mozilla-releng/OpenCloudConfig/commit/d67839dcfda20f096c88d4c640910b805efb80e6 note that 10 is also included (it already had an init script so the firewall exceptions were added to the existing script)
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Rob, I think we might need to roll this back - I believe this is causing bug 1400841. What is your opinion?
Flags: needinfo?(rthijssen)
i agree, we may need to roll it back. we also may need to roll back the g-w update on gecko-t-win10-64-gpu if this is the only place it occurs. since we're having other issues with getting tests functioning on that worker type.
Flags: needinfo?(rthijssen)
So before the upgrade, we weren't reporting crashes to sentry, but they could have been occurring if this task-user-init script was returning a non-zero exit code. We'll see when we roll back this change if we're still getting a non-zero exit code. If it is still non-zero, let's try disabling this task-user-init script before rolling back the worker. I think the only difference between the older version and the newer one, is that the older one isn't reporting crashes to sentry - so I think it might just hide the crash, rather than prevent it. I'm going to be out probably until the end of the week, but back on Monday. But you can call me - I'm having laser eye surgery so can't look at a screen, but if you have questions, I'll be happy for the distraction of talking to someone, rather than being sat in dark rooms wearing sunglasses and not being allowed to look at anything. xD
reverting to g-w 8.3.0 to see if running task-user-init.cmd under the GenericWorker user account (with runAsCurrentUser set to true) resolves the user elevation errors from the init script. https://github.com/mozilla-releng/OpenCloudConfig/commit/3b3b385d449264adba2490393795ae6b19b286aa
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
pmoore: in this task [1], we run the same commands that we attempt to run in the user init script [2]. when run as part of the task payload, the commands succeed. however, when they are run as part of task user initialisation, they fail with errors about requiring an elevated command prompt.
> [1]: https://tools.taskcluster.net/groups/eZyxC_3cRVGc2quDyhsl_A/tasks/eZyxC_3cRVGc2quDyhsl_A/runs/0/logs/public%2Flogs%2Flive.log
> [2]: https://github.com/mozilla-releng/OpenCloudConfig/blob/master/userdata/Configuration/GenericWorker/task-user-init-win10.cmd
we rely on the task user initialisation mechanism when a command must run as the task user, that is, when the change it makes is to the user environment rather than the machine or system environment. can we patch generic worker to run the task user initialisation script successfully?
Flags: needinfo?(pmoore)
here's a variation on the task where we download the cmd script and run it. this also works when run in the task payload. https://tools.taskcluster.net/groups/SYVljAZzRJGguhFMCkZgTA/tasks/SYVljAZzRJGguhFMCkZgTA/runs/0/logs/public%2Flogs%2Flive.log
mark, here's a try push using in-tree preflight scripts: https://hg.mozilla.org/try/rev/2c6bf8878f006d534cf4c630e99c6b9b88be971d i'm now attempting to set the firewall exceptions as part of the preflight scripts (if the push succeeds). can you let me know if these specific rules (python, ssltunnel: inbound only) are enough to prevent the popup dialogs. i wasn't sure if we also need outbound exceptions or if the outbound calls on hw already succeed.
Flags: needinfo?(pmoore) → needinfo?(mcornmesser)
this seems to work. here's a better try push that uses the task id for the rule name instead of the username (which may not change, depending on gw and task configuration) and corrects the paths to the programs: https://hg.mozilla.org/try/rev/fc6b9a4e21f1b1fba5cfa4d56b8b688e407fa2e7 we just need to test on hardware rather than ec2 to validate that the rule actually causes the tests to succeed.
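A minimal sketch of the naming idea, keyed on the task id rather than the username. It assumes the worker exposes the task id in a `TASK_ID` environment variable. The `rule_name` helper and the name format are hypothetical and are not taken from the try push.

```python
import os


def rule_name(program, direction, env=None):
    """Build a firewall rule name that is unique per task, not per user."""
    env = os.environ if env is None else env
    # Assumption: TASK_ID is set in the task environment.
    task_id = env.get("TASK_ID", "unknown-task")
    return f"{task_id}-{program}-{direction}"


if __name__ == "__main__":
    print(rule_name("ssltunnel", "in", {"TASK_ID": "example-task-id"}))
```

Because the name changes with every task, rules left behind by one task cannot be mistaken for rules belonging to the next, even when the username is reused.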
I will be able to test this next week.
Flags: needinfo?(mcornmesser)
Assignee: rthijssen → mcornmesser
Rob, could you add outbound from c:\users\*\build\venv\scripts\python.exe to the preflight scripts?
Flags: needinfo?(rthijssen)
taskcluster windows task users need a couple of specific firewall rules applied to executables created in their task directories. since the path to the task user directory is not known until the task starts, the exceptions must be created during task execution.
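A minimal sketch of resolving the executable paths once the task user directory is known. The relative paths come from the original report, and the three rule directions are the ones requested in this thread. The helper name and the example home directory are hypothetical.

```python
import ntpath

# Paths relative to the task user's home directory, from the original report.
RELATIVE_PROGRAMS = {
    "ssltunnel": r"build\tests\bin\ssltunnel.exe",
    "python": r"build\venv\scripts\python.exe",
}

# Rules requested in this thread: inbound for both, outbound for python.
REQUESTED_RULES = [("ssltunnel", "in"), ("python", "in"), ("python", "out")]


def planned_rules(user_home):
    """Return (full program path, direction) pairs for one task user."""
    return [
        (ntpath.join(user_home, RELATIVE_PROGRAMS[name]), direction)
        for name, direction in REQUESTED_RULES
    ]


if __name__ == "__main__":
    # Hypothetical task user home; on a worker it would be read at task start.
    for path, direction in planned_rules(r"C:\users\task_example"):
        print(direction, path)
```

`ntpath` is used so the path joining behaves the same on any platform. Actually creating the rules still requires an elevated process, which is the open problem in this bug.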
i hope this worked. (first time using phabricator)
Flags: needinfo?(rthijssen)
pmoore is looking into adding the firewall exceptions to the task user creation scripts since the attempt to set these from the preflight script fails in tasks that don't have admin access. see: https://mozilla.logbot.info/ci/20180824#c15215510

Status here? Can we r/f?

Flags: needinfo?(rthijssen)
Flags: needinfo?(mcornmesser)

this can't¹ be fixed properly from infra. it should be fixed in generic-worker.

  1. we could possibly work around the problem by running a scheduled task that checks for task directories and creates the required firewall rules, then cleans up old rules left behind by completed tasks. i've actually already written the patch on several occasions (eg: occ pr 230). however, i think that this workaround is ugly and hacky and fragile and a better solution is for generic-worker to implement a mechanism to run elevated set-up and tear-down scripts before and after tasks.
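For illustration only, the scheduled-task workaround described in the footnote reduces to a reconciliation step like the one below. This is not the code from the occ patch. The `task-fw-` prefix and the function name are hypothetical, and listing directories or calling the firewall is deliberately left out.

```python
def reconcile(task_dirs, existing_rules, prefix="task-fw-"):
    """Return (rules to create, rules to delete) for the current task directories.

    Only rules carrying our prefix are considered, so unrelated firewall
    rules are never scheduled for deletion.
    """
    wanted = {prefix + name for name in task_dirs}
    managed = {rule for rule in existing_rules if rule.startswith(prefix)}
    return sorted(wanted - managed), sorted(managed - wanted)


if __name__ == "__main__":
    to_create, to_delete = reconcile(
        ["task_2", "task_3"],
        ["task-fw-task_1", "task-fw-task_2", "Core Networking"],
    )
    print("create:", to_create)
    print("delete:", to_delete)
```

The fragility mentioned above is visible even in this sketch: a task that starts between two runs of the scheduled job has no rules until the next run.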
Flags: needinfo?(rthijssen)
Status: REOPENED → RESOLVED
Closed: 7 years ago → 5 years ago
Flags: needinfo?(mcornmesser)
Resolution: --- → DUPLICATE