Closed
Bug 906933
Opened 11 years ago
Closed 11 years ago
Failing to connect/reconnect to cell data
Categories
(Firefox OS Graveyard :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: stephend, Unassigned)
References
Details
(Whiteboard: [fromAutomation])
Attachments
(6 files)
Gecko: http://hg.mozilla.org/mozilla-central/rev/c8c9bd74cc40
Gaia: https://github.com/mozilla-b2g/gaia/commit/f6de05c135913f2cb790759335875bb1b3c4f9bb
Device/build: Unagi, on master
Looks like bug 901844 is suspect?
Test: https://github.com/mozilla/gaia-ui-tests/blob/master/gaiatest/tests/unit/test_connect_to_network.py
gaiatest --testvars=/home/webqa/webqa-credentials/b2g/b2g-12.json --address=localhost:2828 --xml-output=results/results.xml --html-output=results/index.html --restart --repeat=0 gaiatest/tests/unit/test_connect_to_network.py
running webserver on http://10.250.1.87:39594/
TEST-START test_connect_to_network.py
test_connect_to_network (test_connect_to_network.TestConnectToNetwork) ... ERROR
======================================================================
ERROR: test_connect_to_network (test_connect_to_network.TestConnectToNetwork)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/var/jenkins/workspace/b2g.unagi.mozril.gaia.master.ui.run_adhoc_tests/.env/local/lib/python2.7/site-packages/marionette_client-0.5.36-py2.7.egg/marionette/marionette_test.py", line 132, in run
testMethod()
File "/var/jenkins/workspace/b2g.unagi.mozril.gaia.master.ui.run_adhoc_tests/gaiatest/tests/unit/test_connect_to_network.py", line 12, in test_connect_to_network
self.connect_to_network()
File "/var/jenkins/workspace/b2g.unagi.mozril.gaia.master.ui.run_adhoc_tests/gaiatest/gaia_test.py", line 534, in connect_to_network
self.data_layer.connect_to_cell_data()
File "/var/jenkins/workspace/b2g.unagi.mozril.gaia.master.ui.run_adhoc_tests/gaiatest/gaia_test.py", line 226, in connect_to_cell_data
result = self.marionette.execute_async_script("return GaiaDataLayer.connectToCellData()", special_powers=True)
File "/var/jenkins/workspace/b2g.unagi.mozril.gaia.master.ui.run_adhoc_tests/.env/local/lib/python2.7/site-packages/marionette_client-0.5.36-py2.7.egg/marionette/marionette.py", line 642, in execute_async_script
filename=os.path.basename(frame[0]))
File "/var/jenkins/workspace/b2g.unagi.mozril.gaia.master.ui.run_adhoc_tests/.env/local/lib/python2.7/site-packages/marionette_client-0.5.36-py2.7.egg/marionette/marionette.py", line 352, in _send_message
self._handle_error(response)
File "/var/jenkins/workspace/b2g.unagi.mozril.gaia.master.ui.run_adhoc_tests/.env/local/lib/python2.7/site-packages/marionette_client-0.5.36-py2.7.egg/marionette/marionette.py", line 401, in _handle_error
raise ScriptTimeoutException(message=message, status=status, stacktrace=stacktrace)
TEST-UNEXPECTED-FAIL | test_connect_to_network.py TestConnectToNetwork.test_connect_to_network | ScriptTimeoutException: timed out
----------------------------------------------------------------------
Ran 1 test in 165.194s
I'll also attach the full logcat
Reporter | ||
Updated•11 years ago
|
Severity: normal → critical
Flags: needinfo?(frsela)
Keywords: regression
OS: Windows 7 → Gonk (Firefox OS)
Hardware: x86_64 → ARM
Comment 1•11 years ago
|
||
I don't think bug 901844 would likely be the culprit here. That bug focuses on SIM toolkit, where as this is talking cell data.
Comment 2•11 years ago
|
||
Analyzing the Jenkins test failures, the browser cell data test started failing on August 16th midday. That would imply the regression happened on August 15th.
Comment 3•11 years ago
|
||
Hsin-Yi - We're seeing continuous failures with connecting to cellular data since August 16th. I see that you've worked on the ril_worker.js recently in the changelog - http://hg.mozilla.org/mozilla-central/filelog/b7f636fada9f/dom/system/gonk/ril_worker.js. Do you have any ideas what caused this regression? Any assistance here would be greatly appreciated to help get our cell data tests back up and running.
Flags: needinfo?(htsai)
Comment 4•11 years ago
|
||
(In reply to Jason Smith [:jsmith] from comment #3)
> Hsin-Yi - We're seeing continuous failures with connecting to cellular data
> since August 16th. I see that you've worked on the ril_worker.js recently in
> the changelog -
> http://hg.mozilla.org/mozilla-central/filelog/b7f636fada9f/dom/system/gonk/
> ril_worker.js. Do you have any ideas what caused this regression? Any
> assistance here would be greatly appreciated to help get our cell data tests
> back up and running.
Hi Jason, thanks for the information. But I don't think the change in bug 905076 is the cause for cellular data failure. That bug is adding a condition check to see if the network is gsm/cdma when dialing out a voice call, not touching data connection.
Flags: needinfo?(htsai)
Comment 5•11 years ago
|
||
(In reply to Hsin-Yi Tsai [:hsinyi] from comment #4)
> (In reply to Jason Smith [:jsmith] from comment #3)
> > Hsin-Yi - We're seeing continuous failures with connecting to cellular data
> > since August 16th. I see that you've worked on the ril_worker.js recently in
> > the changelog -
> > http://hg.mozilla.org/mozilla-central/filelog/b7f636fada9f/dom/system/gonk/
> > ril_worker.js. Do you have any ideas what caused this regression? Any
> > assistance here would be greatly appreciated to help get our cell data tests
> > back up and running.
>
> Hi Jason, thanks for the information. But I don't think the change in bug
> 905076 is the cause for cellular data failure. That bug is adding a
> condition check to see if the network is gsm/cdma when dialing out a voice
> call, not touching data connection.
Right. The regressing patch here would have landed on August 15th, so it wouldn't be caused by bug 905076. I think what we're trying to figure out here is that there was something that landed in the RIL code around August 15th that would have caused this regression.
Comment 6•11 years ago
|
||
(In reply to Jason Smith [:jsmith] from comment #5)
> (In reply to Hsin-Yi Tsai [:hsinyi] from comment #4)
> > (In reply to Jason Smith [:jsmith] from comment #3)
> > > Hsin-Yi - We're seeing continuous failures with connecting to cellular data
> > > since August 16th. I see that you've worked on the ril_worker.js recently in
> > > the changelog -
> > > http://hg.mozilla.org/mozilla-central/filelog/b7f636fada9f/dom/system/gonk/
> > > ril_worker.js. Do you have any ideas what caused this regression? Any
> > > assistance here would be greatly appreciated to help get our cell data tests
> > > back up and running.
> >
> > Hi Jason, thanks for the information. But I don't think the change in bug
> > 905076 is the cause for cellular data failure. That bug is adding a
> > condition check to see if the network is gsm/cdma when dialing out a voice
> > call, not touching data connection.
>
> Right. The regressing patch here would have landed on August 15th, so it
> wouldn't be caused by bug 905076. I think what we're trying to figure out
> here is that there was something that landed in the RIL code around August
> 15th that would have caused this regression.
I quickly overviewed the change log but nothing looks obviously, though bug 903257 might be a possible one. I am ccing other RIL team members, hopefully we can get more hints here. And this bug happens on automation test, not on physical machines right? Thank you.
Reporter | ||
Comment 7•11 years ago
|
||
(In reply to Hsin-Yi Tsai [:hsinyi] from comment #6)
> I quickly overviewed the change log but nothing looks obviously, though bug
> 903257 might be a possible one. I am ccing other RIL team members, hopefully
> we can get more hints here. And this bug happens on automation test, not on
> physical machines right? Thank you.
This is on real Unagis, but on automation through Python (calling async-JavaScript), right. There might be a way to manually trigger it, but is definitely 100% through automation.
Comment 8•11 years ago
|
||
Some observation here:
From the log, seems RIL code didn't receive APN settings sending from Gaia. So it fails in setting up data call.
=============================
E/GeckoConsole( 446): Content JS LOG at dummy file:499 in GaiaDataLayer.setSetting: setting ril.data.enabled to true
I/Gecko ( 446): -*- RadioInterface[0]: 'ril.data.enabled' is now true
I/Gecko ( 446): -*- RadioInterface[0]: We haven't gotten completely the APN data.
============================
Also, RIL is waiting for 'ril.data.apnSettings' but I didn't see 'ril.data.apnSettings' in the log. I saw only 'ril.data.apn' 'ril.data.user' and 'ril.data.passwd' but they are old format, not what RIL expects now.
Comment 9•11 years ago
|
||
Unagi pvt m-c eng build.
Gaia: f6de05c135913f2cb790759335875bb1b3c4f9bb
Gecko: http://hg.mozilla.org/mozilla-central/rev/c8c9bd74cc40
BuildID 20130819040203
Version 26.0a1
Chunghwa SIM card works fine.
TaiwanMobile SIM card works fine, attach the log file.
Comment 10•11 years ago
|
||
(In reply to Hsin-Yi Tsai [:hsinyi] from comment #8)
> Some observation here:
>
> From the log, seems RIL code didn't receive APN settings sending from Gaia.
> So it fails in setting up data call.
> =============================
> E/GeckoConsole( 446): Content JS LOG at dummy file:499 in
> GaiaDataLayer.setSetting: setting ril.data.enabled to true
> I/Gecko ( 446): -*- RadioInterface[0]: 'ril.data.enabled' is now true
> I/Gecko ( 446): -*- RadioInterface[0]: We haven't gotten completely the
> APN data.
> ============================
>
> Also, RIL is waiting for 'ril.data.apnSettings' but I didn't see
> 'ril.data.apnSettings' in the log. I saw only 'ril.data.apn' 'ril.data.user'
> and 'ril.data.passwd' but they are old format, not what RIL expects now.
After trace the log, GaiaDataLayer set the "ril.data.apn", "ril.data.passwd" and "ril.data.user".
Bug 850555 and Bug 837488 changed the setting architecture, it might be the possible root cause.
Comment 11•11 years ago
|
||
Hi Hsin yi,
Reproduced on
Gaia: f6de05c135913f2cb790759335875bb1b3c4f9bb
B-D 2013-08-19 18:36:38
Gecko: http://hg.mozilla.org/mozilla-central/rev/c8c9bd74cc40
BuildID 20130819040203
Version 26.0a1
Try three times on gaiatest/tests/browser/test_browser_cell_data.py, 2 passed and 1 failed and log will be attached. Please let me know any other info I can provide.
Hi Stephen,
I found that volumn will be changed to either mute or vibration in test_browser_cell_data.py, is this correct step?
Comment 12•11 years ago
|
||
Comment 13•11 years ago
|
||
Updated•11 years ago
|
Attachment #792648 -
Attachment mime type: text/x-log → text/plain
Comment 14•11 years ago
|
||
(In reply to echu from comment #11)
> Hi Stephen,
>
> I found that volumn will be changed to either mute or vibration in
> test_browser_cell_data.py, is this correct step?
Yes, it is correct.
Gaiatest will set the volume of following channels (master, content, notification, alarm, telephony, and bt_sco) to zero.
Comment 15•11 years ago
|
||
Attached failed case summary of test_browser_cell_data.py test.
starting httpd
running webserver on http://10.247.24.84:46235/
TEST-START test_browser_cell_data.py
test_browser_cell_data (test_browser_cell_data.TestBrowserCellData) ... ERROR
======================================================================
ERROR: test_browser_cell_data (test_browser_cell_data.TestBrowserCellData)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/marionette_client-0.5.36-py2.7.egg/marionette/marionette_test.py", line 123, in run
self.setUp()
File "/home/enpei/FirefoxOS/gaia-ui-tests/gaiatest/tests/browser/test_browser_cell_data.py", line 16, in setUp
GaiaTestCase.setUp(self)
File "/home/enpei/FirefoxOS/gaia-ui-tests/gaiatest/gaia_test.py", line 454, in setUp
self.cleanUp()
File "/home/enpei/FirefoxOS/gaia-ui-tests/gaiatest/gaia_test.py", line 503, in cleanUp
self.data_layer.enable_wifi()
File "/home/enpei/FirefoxOS/gaia-ui-tests/gaiatest/gaia_test.py", line 250, in enable_wifi
result = self.marionette.execute_async_script("return GaiaDataLayer.enableWiFi()", special_powers=True)
File "/usr/local/lib/python2.7/dist-packages/marionette_client-0.5.36-py2.7.egg/marionette/marionette.py", line 642, in execute_async_script
filename=os.path.basename(frame[0]))
File "/usr/local/lib/python2.7/dist-packages/marionette_client-0.5.36-py2.7.egg/marionette/marionette.py", line 352, in _send_message
self._handle_error(response)
File "/usr/local/lib/python2.7/dist-packages/marionette_client-0.5.36-py2.7.egg/marionette/marionette.py", line 401, in _handle_error
raise ScriptTimeoutException(message=message, status=status, stacktrace=stacktrace)
TEST-UNEXPECTED-FAIL | test_browser_cell_data.py TestBrowserCellData.test_browser_cell_data | ScriptTimeoutException: timed out
----------------------------------------------------------------------
Ran 1 test in 73.563s
FAILED (errors=1)
SUMMARY
-------
passed: 0
failed: 1
todo: 0
FAILED TESTS
Comment 16•11 years ago
|
||
(In reply to echu from comment #15)
> Attached failed case summary of test_browser_cell_data.py test.
>
> starting httpd
> running webserver on http://10.247.24.84:46235/
> TEST-START test_browser_cell_data.py
> test_browser_cell_data (test_browser_cell_data.TestBrowserCellData) ... ERROR
>
> ======================================================================
> ERROR: test_browser_cell_data (test_browser_cell_data.TestBrowserCellData)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
> File
> "/usr/local/lib/python2.7/dist-packages/marionette_client-0.5.36-py2.7.egg/
> marionette/marionette_test.py", line 123, in run
> self.setUp()
> File
> "/home/enpei/FirefoxOS/gaia-ui-tests/gaiatest/tests/browser/
> test_browser_cell_data.py", line 16, in setUp
> GaiaTestCase.setUp(self)
> File "/home/enpei/FirefoxOS/gaia-ui-tests/gaiatest/gaia_test.py", line
> 454, in setUp
> self.cleanUp()
> File "/home/enpei/FirefoxOS/gaia-ui-tests/gaiatest/gaia_test.py", line
> 503, in cleanUp
> self.data_layer.enable_wifi()
> File "/home/enpei/FirefoxOS/gaia-ui-tests/gaiatest/gaia_test.py", line
> 250, in enable_wifi
> result = self.marionette.execute_async_script("return
> GaiaDataLayer.enableWiFi()", special_powers=True)
> File
> "/usr/local/lib/python2.7/dist-packages/marionette_client-0.5.36-py2.7.egg/
> marionette/marionette.py", line 642, in execute_async_script
> filename=os.path.basename(frame[0]))
> File
> "/usr/local/lib/python2.7/dist-packages/marionette_client-0.5.36-py2.7.egg/
> marionette/marionette.py", line 352, in _send_message
> self._handle_error(response)
> File
> "/usr/local/lib/python2.7/dist-packages/marionette_client-0.5.36-py2.7.egg/
> marionette/marionette.py", line 401, in _handle_error
> raise ScriptTimeoutException(message=message, status=status,
> stacktrace=stacktrace)
> TEST-UNEXPECTED-FAIL | test_browser_cell_data.py
> TestBrowserCellData.test_browser_cell_data | ScriptTimeoutException: timed
> out
> ----------------------------------------------------------------------
> Ran 1 test in 73.563s
>
> FAILED (errors=1)
>
> SUMMARY
> -------
> passed: 0
> failed: 1
> todo: 0
>
> FAILED TESTS
It fail at setUp() before running the test.
The setUp will try to enable Wifi, forget all networks, and then disable Wifi.
Comment 17•11 years ago
|
||
You can try to apply the patch in Bug 906595. It seems work for me.
Reporter | ||
Updated•11 years ago
|
Blocks: b2g-central-dogfood
Reporter | ||
Comment 18•11 years ago
|
||
The fix in bug 906595, which we took in https://pvtbuilds.mozilla.org/pub/mozilla.org/b2g/nightly/mozilla-central-unagi-eng/2013/08/2013-08-20-17-08-24/unagi.zip, doesn't fix the issue.
We still see it; please see http://qa-selenium.mv.mozilla.com:8080/view/B2G%20Unagi/job/b2g.unagi.mozril.gaia.master.ui.run_adhoc_tests/28/console:
ERROR: test_browser_cell_data (test_browser_cell_data.TestBrowserCellData)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/var/jenkins/workspace/b2g.unagi.mozril.gaia.master.ui.run_adhoc_tests/.env/local/lib/python2.7/site-packages/marionette_client-0.5.36-py2.7.egg/marionette/marionette_test.py", line 123, in run
self.setUp()
File "/var/jenkins/workspace/b2g.unagi.mozril.gaia.master.ui.run_adhoc_tests/gaiatest/tests/browser/test_browser_cell_data.py", line 17, in setUp
self.data_layer.connect_to_cell_data()
File "/var/jenkins/workspace/b2g.unagi.mozril.gaia.master.ui.run_adhoc_tests/gaiatest/gaia_test.py", line 226, in connect_to_cell_data
result = self.marionette.execute_async_script("return GaiaDataLayer.connectToCellData()", special_powers=True)
File "/var/jenkins/workspace/b2g.unagi.mozril.gaia.master.ui.run_adhoc_tests/.env/local/lib/python2.7/site-packages/marionette_client-0.5.36-py2.7.egg/marionette/marionette.py", line 642, in execute_async_script
filename=os.path.basename(frame[0]))
File "/var/jenkins/workspace/b2g.unagi.mozril.gaia.master.ui.run_adhoc_tests/.env/local/lib/python2.7/site-packages/marionette_client-0.5.36-py2.7.egg/marionette/marionette.py", line 352, in _send_message
self._handle_error(response)
File "/var/jenkins/workspace/b2g.unagi.mozril.gaia.master.ui.run_adhoc_tests/.env/local/lib/python2.7/site-packages/marionette_client-0.5.36-py2.7.egg/marionette/marionette.py", line 401, in _handle_error
raise ScriptTimeoutException(message=message, status=status, stacktrace=stacktrace)
TEST-UNEXPECTED-FAIL | test_browser_cell_data.py TestBrowserCellData.test_browser_cell_data | ScriptTimeoutException: timed out
----------------------------------------------------------------------
Ran 1 test in 104.065s
Reporter | ||
Comment 19•11 years ago
|
||
Comment 20•11 years ago
|
||
(In reply to Stephen Donner [:stephend] from comment #19)
> Created attachment 793370 [details]
> logcat.txt for post-906933-fix
Looks the reason for the test timed out is still the same as comment 10. I see the message "We haven't gotten completely the APN data." all over the log. That said, RIL didn't get the complete APN settings to setup data call. So, even GaiaDataLayer.setSetting: setting ril.data.enabled to true, RIL is never able to setup data call.
======
E/GeckoConsole( 447): Content JS LOG at dummy file:499 in GaiaDataLayer.setSetting: setting ril.data.enabled to true
I/Gecko ( 447): -*- RadioInterface[0]: 'ril.data.enabled' is now true
I/Gecko ( 447): -*- RadioInterface[0]: We haven't gotten completely the APN data.
E/GeckoConsole( 447): Content JS LOG at dummy file:502 in GaiaDataLayer.setSetting/req.onsuccess: setting changed
=======
I glanced over the files and directories in "gaia/tests" but I am not sure I've seen where gaia tests assign/send 'ril.data.apnSettings' to RIL.
Comment 21•11 years ago
|
||
Hi Stephen,
I also run the test_browser_cell_data on latest build, bug 906595 is not reproducible and this issue is not found for running 11 times. Based on Askeing and HsinYi's comment 10 and comment 19, can you check the setting values in testvars.json again to see if this issue can be solved?
Gaia: 70033ba9cbac65551f68fcb3a28f7daf3105e6ff
B-D 2013-08-21 20:03:04
Gecko: http://hg.mozilla.org/mozilla-central/rev/b2486721572e
BuildID 20130821050136
Version 26.0a1
Enpei
Comment 22•11 years ago
|
||
Reporter | ||
Updated•11 years ago
|
Flags: needinfo?(frsela)
Reporter | ||
Comment 23•11 years ago
|
||
This was indeed a RIL-APN settings issue, so with Al's help (thanks, Al!), we updated as such:
- "ril.data.apn": "wap.cingular",
- "ril.data.user": "WAP@CINGULARGPRS.COM",
- "ril.data.passwd": "CINGULAR1"
+ "ril.data.apnSettings": [[{
+ "apn": "wap.cingular",
+ "user": "WAP@CINGULARGPRS.COM",
+ "passwd": "CINGULAR1",
+ "types": ["default","mms","supl"]
+ }]]
Marking as FIXED - thanks to all for your help and patience in investigation and working with us on getting the "fix" (changes) in place!
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Reporter | ||
Updated•11 years ago
|
Keywords: regression
You need to log in
before you can comment on or make changes to this bug.
Description
•