773482 - (b2g-reftest) Tracking Bug to enable reftests on B2G

Reporter

Description

•

12 years ago

Many reftests fail on B2G. This is a tracking bug to triage, fix and enable/skip them.

Andrew Halberstadt [:ahal]

Reporter

Updated

•

12 years ago

Depends on: 774396

Andrew Halberstadt [:ahal]

Reporter

Updated

•

12 years ago

Blocks: 770490

Andrew Halberstadt [:ahal]

Reporter

Updated

•

12 years ago

Depends on: 774405

Andrew Halberstadt [:ahal]

Reporter

Updated

•

12 years ago

Depends on: 774682

Andrew Halberstadt [:ahal]

Reporter

Comment 1

•

12 years ago

Creating a bug per manifest doesn't make a whole lot of sense. Instead, I'm uploading the raw logs and will mark failing tests skip-if(B2G), referencing this bug. Tests can be re-enabled or changed to random, etc., as I do more testing on the machines that will run these in CI.

Raw logs:
http://people.mozilla.com/~ahalberstadt/reftest/170712_reftest_logs.zip

HTML formatted output:
http://people.mozilla.com/~ahalberstadt/reftest/170712_reftest_html.zip
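
As a rough illustration of the skip-if(B2G) approach described in this comment, here is a minimal Python sketch that prefixes failing manifest entries with the annotation and a reference to this bug. The helper name `skip_on_b2g`, the trailing `# bug N` comment style, and the simplification of handling only `==` and `!=` entries are assumptions made for the example. It is not the script that was actually used.

```python
BUG = 773482  # this tracking bug


def skip_on_b2g(manifest_text, failing_tests, bug=BUG):
    """Prefix entries whose test file is in failing_tests with skip-if(B2G).

    Hypothetical helper: only simple '==' / '!=' entries are recognised.
    """
    out = []
    for line in manifest_text.splitlines():
        # Ignore anything after a '#' when looking for the test name.
        tokens = line.split("#", 1)[0].split()
        test = None
        for i, tok in enumerate(tokens[:-1]):
            if tok in ("==", "!="):
                test = tokens[i + 1]  # the test file follows the operator
                break
        if test in failing_tests and "skip-if(B2G)" not in tokens:
            line = f"skip-if(B2G) {line.strip()} # bug {bug}"
        out.append(line)
    return "\n".join(out)


manifest = "== a.html a-ref.html\n!= b.html b-ref.html\n# a comment"
print(skip_on_b2g(manifest, {"a.html"}))
```

Comment lines, blank lines and entries that are not in the failing set pass through unchanged, so the function can be run over a whole manifest.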

Andrew Halberstadt [:ahal]

Reporter

Comment 2

•

12 years ago

Attached file list of reftests that fail on B2g (deleted) — Details

Note: this list does not include the subset of reftests that are disabled for Fennec.

Jonathan Griffin (:jgriffin)

Comment 3

•

12 years ago

(In reply to Andrew Halberstadt [:ahal] from comment #2)
> Created attachment 643144 [details]
> list of reftests that fail on B2g
> 
> Note, these are not including the subset of reftests that are disabled for
> fennec.

For the record, the number of failing tests is currently 1313.

Joel Maher ( :jmaher ) (UTC -8)

Comment 4

•

12 years ago

Is that consistent if you run it over and over again?

Andrew Halberstadt [:ahal]

Reporter

Comment 5

•

12 years ago

I would say almost certainly no. For now I'm going to disable these and try to get a green run. Then I'll set up the reftest desktop that arrived in Toronto and run everything a bunch of times to try to nail down random vs. consistent failures.

It takes about 4 hours to run on my local machine (due to having to save all the images to data URLs on failure).
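
One way to separate random from consistent failures after several runs is sketched below in Python. The input shape (one dict per run, mapping test name to pass/fail) and the function name `classify` are assumptions for illustration. The real logs would need to be parsed into that shape first.

```python
from collections import defaultdict


def classify(runs):
    """Label each test 'pass', 'fail' or 'random' across repeated runs.

    runs: list of dicts mapping test name -> True (passed) / False (failed).
    """
    outcomes = defaultdict(set)
    for run in runs:
        for test, passed in run.items():
            outcomes[test].add(passed)

    result = {}
    for test, seen in outcomes.items():
        if seen == {True}:
            result[test] = "pass"
        elif seen == {False}:
            result[test] = "fail"
        else:
            # Both outcomes were observed, so the test is intermittent.
            result[test] = "random"
    return result


runs = [
    {"a.html": True, "b.html": False, "c.html": True},
    {"a.html": True, "b.html": False, "c.html": False},
]
print(classify(runs))
```

Tests labelled "fail" are candidates for skipping or fixing, while "random" ones would be marked as such rather than disabled outright.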

Andrew Halberstadt [:ahal]

Reporter

Comment 6

•

12 years ago

I'm seeing the same behaviour that Joel did for native android. If I skip failing tests a new set of failing tests just crops up. The scary thing is that I'm restarting the emulator between each subdirectory so they shouldn't be affecting each other.

I'll try getting them running on a panda board next week to see if that is any different.

Andrew Halberstadt [:ahal]

Reporter

Comment 7

•

12 years ago

I would say almost certainly no. For now I'm going to disable these and try and get a green run. Then I'll set up the reftest desktop and run everything a bunch of times to try and nail down random vs failures.

It takes about 4 hours to run on my local machine (due to having to save all the images to data URLs on failure).

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 8

•

12 years ago

What desktop are you running the emulator on?  If it's desktop-linux-NVIDIA, I would expect many failures because we don't test rendering at all on that gpu, and additional errors can creep in from the emulator translation layer.

(In reply to Andrew Halberstadt [:ahal] from comment #6)
> that I'm restarting the emulator between each subdirectory so they shouldn't
> be affecting each other.

That should not be necessary at all, and is probably why the tests take 4 hours to run ;).

(In reply to Andrew Halberstadt [:ahal] from comment #7)
> I would say almost certainly no. For now I'm going to disable these and try
> and get a green run.

Let's not start disabling until we triage the failures and decide how much we're going to support the desktop-linux-emulator setup.  If the pandaboard is a long way off, then we'll probably need to support these to some degree.

I'll help with this.

Andrew Halberstadt [:ahal]

Reporter

Comment 9

•

12 years ago

(In reply to Chris Jones [:cjones] [:warhammer] from comment #8)
> What desktop are you running the emulator on?  If it's desktop-linux-NVIDIA,
> I would expect many failures because we don't test rendering at all on that
> gpu, and additional errors can creep in from the emulator translation layer.

Yes, my laptop and the desktop are both Linux with Nvidia GT218 and Nvidia GT216 cards respectively. I can order a new card if it becomes a big problem.

> That should not be necessary at all, and is probably why the tests take 4
> hours to run ;).

It shouldn't be necessary, but on my laptop at least, the order in which tests were run had a dramatic effect on which ones passed and failed. The failures had nothing to do with the tests themselves either: if they got disabled, new ones would fail in their place. It was impossible to triage, so I thought separating the tests by their respective manifests would make it easier to find common sources of trouble. I think the main reason it took so long on my laptop was the large number of failures causing tons of data URLs to be saved in the logs.

This seems to be better on the desktop (or maybe some fixes landed in b2g). I'll know more by tomorrow.

> Let's not start disabling until we triage the failures and decide how much
> we're going to support the desktop-linux-emulator setup.  If the pandaboard
> is a long way off, then we'll probably need to support these to some degree.
> 
> I'll help with this.

Sounds good! Thanks :)

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 10

•

12 years ago

(In reply to Andrew Halberstadt [:ahal] from comment #9)
> (In reply to Chris Jones [:cjones] [:warhammer] from comment #8)
> > What desktop are you running the emulator on?  If it's desktop-linux-NVIDIA,
> > I would expect many failures because we don't test rendering at all on that
> > gpu, and additional errors can creep in from the emulator translation layer.
> 
> Yes, my laptop and the desktop are both Linux with Nvidia GT218 and Nvidia
> GT216 cards respectively. I can order a new card if it becomes a big problem.
> 

Yeah, we don't really support GLX well.  It's a big barrel of monkeys.

> > That should not be necessary at all, and is probably why the tests take 4
> > hours to run ;).
> 
> It shouldn't be necessary, but on my laptop at least the order of tests that
> got run had a dramatic effect on which ones passed and failed.

That's an unnecessary workaround.  How about we start with one directory of tests, layout/reftest/reftest-sanity, see if we get reliable results, triage the failures / intermittent-ness, get them reliably passing, and then move outwards? :)

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 11

•

12 years ago

Hi Andrew, do you have any initial results for reftest-sanity?  Are the results consistent or inconsistent?

/me waiting on pins and needles ... ;)

Andrew Halberstadt [:ahal]

Reporter

Comment 12

•

12 years ago

Attached file reftest_sanity log (deleted) — Details

Hey Chris, here are the results for reftest-sanity, and they are indeed consistent :)

I actually have the results for all the other manifests as well (see them here: http://people.mozilla.org/~ahalberstadt/reftest/260712_reftest_logs.zip)

There are around 400 failing tests, over 200 of which are in ../../image/test/reftest.list. These 200 seem to fail because the image is being rendered in the centre of the screen instead of the top-left corner.

Of the remaining 200 failures, a fair number are exceptions (presumably APIs that B2G doesn't have are being called, though I haven't looked into it).

Andrew Halberstadt [:ahal]

Reporter

Comment 13

•

12 years ago

I disabled the 400 failing tests and am currently running all tests at once from the root manifest (in 3 chunks). I'll let you know how that goes tomorrow morning (I think reftests are just crazy slow on the emulator in general).
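
For reference, splitting a test list into 3 chunks can be sketched as below. This is a generic contiguous-slice scheme in Python with a hypothetical `chunk` helper. It is an assumption for illustration and not necessarily how the reftest harness divides the root manifest.

```python
def chunk(tests, total_chunks, this_chunk):
    """Return the this_chunk-th (1-based) of total_chunks contiguous slices."""
    if not 1 <= this_chunk <= total_chunks:
        raise ValueError("this_chunk must be between 1 and total_chunks")
    size, extra = divmod(len(tests), total_chunks)
    # The first `extra` chunks each take one additional test.
    start = (this_chunk - 1) * size + min(this_chunk - 1, extra)
    end = start + size + (1 if this_chunk <= extra else 0)
    return tests[start:end]


tests = [f"test{i}.html" for i in range(10)]
for n in (1, 2, 3):
    print(n, chunk(tests, 3, n))
```

Every test lands in exactly one chunk, and chunk sizes differ by at most one.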

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 14

•

12 years ago

Andrew, can you add the instructions for running these reftests to https://wiki.mozilla.org/B2G/Hacking#Reftests?  I'd like to give them a spin locally and review that list.

Andrew Halberstadt [:ahal]

Reporter

Comment 15

•

12 years ago

Updated the docs: https://wiki.mozilla.org/B2G/Hacking#Reftests
Let me know if you run into trouble.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Updated

•

12 years ago

Depends on: 778072

Andrew Halberstadt [:ahal]

Reporter

Updated

•

12 years ago

Depends on: 778725

Andrew Halberstadt [:ahal]

Reporter

Updated

•

12 years ago

Blocks: mobile-automation

Andrew Halberstadt [:ahal]

Reporter

Comment 16

•

12 years ago

Reftests are being run against b2g nightlies and are reported here: http://brasstacks.mozilla.com/autolog/?tree=b2g&source=autolog

Let me know if you want me to enable/disable any of the tests. For now, it is the same subset that fennec runs.

Andrew Halberstadt [:ahal]

Reporter

Updated

•

12 years ago

Depends on: 782655

Andrew Halberstadt [:ahal]

Reporter

Updated

•

12 years ago

Depends on: 783621

Andrew Halberstadt [:ahal]

Reporter

Updated

•

12 years ago

Depends on: 783632

Andrew Halberstadt [:ahal]

Reporter

Updated

•

12 years ago

Depends on: 783658

Andrew Halberstadt [:ahal]

Reporter

Updated

•

12 years ago

Depends on: 737961

Andrew Halberstadt [:ahal]

Reporter

Updated

•

12 years ago

Depends on: 784810

Andrew Halberstadt [:ahal]

Reporter

Updated

•

12 years ago

Depends on: 785074

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Updated

•

12 years ago

Depends on: 780902

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Updated

•

12 years ago

Depends on: 780920
No longer depends on: 780902

Andrew Halberstadt [:ahal]

Reporter

Updated

•

12 years ago

Depends on: 807970

Andrew Halberstadt [:ahal]

Reporter

Comment 17

•

12 years ago

Reftests are turned on in Cedar: https://tbpl.mozilla.org/?tree=Cedar

I'll have a patch to get the full set of passing ones enabled soon.

Andrew Halberstadt [:ahal]

Reporter

Updated

•

12 years ago

Depends on: 808771

Andrew Halberstadt [:ahal]

Reporter

Updated

•

12 years ago

Depends on: 810401

Andrew Halberstadt [:ahal]

Reporter

Updated

•

12 years ago

Depends on: 811779

Andrew Halberstadt [:ahal]

Reporter

Updated

•

12 years ago

Depends on: 818103

Andrew Halberstadt [:ahal]

Reporter

Updated

•

12 years ago

Depends on: 829626

Andrew Halberstadt [:ahal]

Reporter

Updated

•

12 years ago

Alias: b2g-reftest

Andrew Halberstadt [:ahal]

Reporter

Updated

•

12 years ago

Depends on: b2g-crashtest

Ed Morley [:emorley]

Updated

•

12 years ago

Depends on: 839735

Andrew Halberstadt [:ahal]

Reporter

Updated

•

12 years ago

Depends on: 843634

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Updated

•

12 years ago

Blocks: b2g-testing

Ryan VanderMeulen [:RyanVM]

Updated

•

11 years ago

Depends on: 853024

Andrew Halberstadt [:ahal]

Reporter

Updated

•

11 years ago

Depends on: 861186

Ryan VanderMeulen [:RyanVM]

Updated

•

11 years ago

Depends on: 861928

Andrew Halberstadt [:ahal]

Reporter

Updated

•

11 years ago

Depends on: 862787

Joe Drew (not getting mail)

Updated

•

11 years ago

Depends on: 869011

Ed Morley [:emorley]

Updated

•

11 years ago

Depends on: 870757

Jonathan Griffin (:jgriffin)

Updated

•

11 years ago

Depends on: 876801

Andrew Halberstadt [:ahal]

Reporter

Updated

•

11 years ago

Depends on: 922680

Andrew Halberstadt [:ahal]

Reporter

Updated

•

11 years ago

Depends on: 958518

Daniel Holbert [:dholbert]

Updated

•

11 years ago

Blocks: 981110

David Baron :dbaron:

Updated

•

10 years ago

Depends on: 986409

Andrew Halberstadt [:ahal]

Reporter

Updated

•

10 years ago

Depends on: B2GRT

Daniel Holbert [:dholbert]

Updated

•

10 years ago

Depends on: 1084564

Seth Fowler [:seth] [:s2h]

Updated

•

10 years ago

Depends on: 1091229

Daniel Holbert [:dholbert]

Comment 18

•

8 years ago

This is WONTFIX (i.e. we're not going to invest in enabling/fixing reftests on B2G), now that we're removing B2G code from the tree via bug 1306391.

Status: NEW → RESOLVED

Closed: 8 years ago

Resolution: --- → WONTFIX

Daniel Holbert [:dholbert]

Comment 19

•

8 years ago

(side note: we have lots of references to this tracking bug in reftest.list; many [maybe all?] of those are being removed in bug 1307332.)
