FSK filerw and its component UUTs failing on Zynq UltraScale+

David Banks
Thu, Jan 10, 2019 7:46 PM

Hi,

So far, I have ptest, tb_bias_v2 and testbias working on the zcu111/zcu102
Zynq UltraScale+ devices. I have also verified some custom apps against zed
(pattern_v2->file_write, file_read->capture_v2, and data_src->file_write).
However, FSK filerw is failing.

Here is a visual diff of the hexdump between zcu111 and zed for the
fsk_filerw output:
[image: image.png]

And here are the plotted outputs (zed on top, zcu111 below):
[image: image.png]

I decided to drill down to the individual component unit tests. I am still
building some of them, but here are my results so far:
PASSING: fir_real/complex_sse, complex_mixer, phase_to_amp_cordic, and all
RCC uuts
FAILING: dc_offset_filter, iq_imbalance_fixer, pr_cordic, and rp_cordic

And here are plots of the results for dc_offset_filter and
iq_imbalance_fixer on both zed and zcu111 (inputs on top, outputs below in
each image):
dc_offset_filter on zed:
[image: image.png]

vs dc_offset_filter on zcu111:
[image: image.png]

Here, I zoom into the time plot on zcu111:
[image: image.png]

And then for iq_imbalance_fixer on zed:
[image: image.png]

and iq_imbalance_fixer on zcu111:
[image: image.png]

I have also examined the worker, assembly and container synthesis logs as
well as the implementation logs for these tests and compared them to zed. I
saw nothing worrisome there.

I am using python version 2.7.5 to plot all of these results.

From the hexdump as well as these plots, it looks like we are getting some
bits erroneously flipped to 0 in these tests and in FSK filerw.
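
One way to check the "bits flipped to 0" impression numerically rather than by eye is to compare the two output files bit by bit. This is a minimal sketch, not part of the original test flow: the function name and the file names in the usage comment are hypothetical, and it assumes both outputs have the same length.

```python
def classify_bit_errors(reference, observed):
    """Count differing bits between two equal-length byte strings.

    Returns (cleared, raised):
      cleared = bits that are 1 in reference but 0 in observed
      raised  = bits that are 0 in reference but 1 in observed
    """
    if len(reference) != len(observed):
        raise ValueError("inputs must be the same length")
    cleared = 0
    raised = 0
    for ref_byte, obs_byte in zip(bytearray(reference), bytearray(observed)):
        diff = ref_byte ^ obs_byte          # bits that differ
        cleared += bin(diff & ref_byte).count("1")
        raised += bin(diff & obs_byte).count("1")
    return cleared, raised


# Hypothetical usage (file names are placeholders):
#   with open("zed.out", "rb") as f_ref, open("zcu111.out", "rb") as f_obs:
#       print(classify_bit_errors(f_ref.read(), f_obs.read()))
```

If the second count comes back as zero while the first does not, every error is a 1 that became a 0, which matches what the plots suggest.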

Any ideas for debugging this issue? I am branched off of release_1.4.

Thanks!
David Banks
dbanks@geontech.com
Geon Technologies, LLC

David Banks
Fri, Jan 11, 2019 9:47 PM

I moved to a simpler application to debug this. I have testbias with
biasValue=0x00000001 and input file set to /dev/zero. So, every output
sample should be 0x1. Instead, I see:
0000000 0001 0000 0001 0000 0001 0000 0001 0000
*
0000040 0000 0000 0000 0000 0000 0000 0000 0000
*
0000080 0001 0000 0001 0000 0001 0000 0001 0000
*
0001050 0000 0000 0000 0000 0000 0000 0000 0000
*
0001080 0001 0000 0001 0000 0001 0000 0001 0000
*
0002040 0000 0000 0000 0000 0000 0000 0000 0000
*
0002080 0001 0000 0001 0000 0001 0000 0001 0000
*
0003050 0000 0000 0000 0000 0000 0000 0000 0000
*
0003080 0001 0000 0001 0000 0001 0000 0001 0000
*
0004040 0000 0000 0000 0000 0000 0000 0000 0000
*
0004080 0001 0000 0001 0000 0001 0000 0001 0000
*
0005050 0000 0000 0000 0000 0000 0000 0000 0000
*
0005080 0001 0000 0001 0000 0001 0000 0001 0000
*
0006040 0000 0000 0000 0000 0000 0000 0000 0000
*
0006080 0001 0000 0001 0000 0001 0000 0001 0000
*
0007050 0000 0000 0000 0000 0000 0000 0000 0000
*
0007080 0001 0000 0001 0000 0001 0000 0001 0000
*
0008040 0000 0000 0000 0000 0000 0000 0000 0000
*
0008080 0001 0000 0001 0000 0001 0000 0001 0000
*

So, it looks like the glitches to 0 are occurring every ~0x1000 bytes and
last for 0x30 or 0x40 bytes.
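
The offsets and lengths of the glitches can also be pulled out of the output file directly instead of reading them off the hexdump. The sketch below is an illustration only: the function name is hypothetical, and it assumes the output is a stream of little-endian 32-bit samples, which is consistent with the 16-bit word view in the dump above but is not confirmed by it.

```python
import struct


def find_bad_runs(data, expected=1):
    """Return (byte_offset, byte_length) for each run of 32-bit samples
    that differ from `expected`. Samples are assumed little-endian."""
    count = len(data) // 4
    samples = struct.unpack("<%dI" % count, data[:count * 4])
    runs = []
    start = None                      # index of first bad sample in current run
    for index, sample in enumerate(samples):
        if sample != expected:
            if start is None:
                start = index
        elif start is not None:
            runs.append((start * 4, (index - start) * 4))
            start = None
    if start is not None:             # run extends to end of data
        runs.append((start * 4, (count - start) * 4))
    return runs


# Hypothetical usage (file name is a placeholder):
#   with open("testbias.out", "rb") as f:
#       for offset, length in find_bad_runs(f.read()):
#           print("0x%07x  0x%x bytes" % (offset, length))
```

Printing the offsets in hex makes any relationship to a power-of-two boundary such as 0x1000 easy to spot.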

-David
David Banks
dbanks@geontech.com
Geon Technologies, LLC

David Banks
Thu, Jan 17, 2019 8:25 PM

After quite a bit of debugging and some great help from Jim, I determined
that AXI data bursts were crossing 4KB address boundaries. The AXI spec
forbids this. It never caused a visible problem on the Zynq hardware, but
it is proving to be a hard requirement on ZynqMP. By changing the
BUFFER_ALIGNMENT constant in runtime/util/property/include/OcpiUtilPort.h
from 16 to 128 (as per Jim's recommendation), I was able to get this app,
fsk (filerw mode), and the unit tests working.
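
To illustrate why buffer alignment matters here, the following sketch checks whether a burst spans two 4KB pages and shows the effect of rounding the start address up to a 128-byte boundary. The helper names and the 128-byte burst length are assumptions made for the example; the actual burst sizes used by the data plane are not stated in this thread.

```python
def crosses_4k_boundary(address, nbytes):
    """True if the burst [address, address + nbytes) touches two 4KB pages."""
    if nbytes <= 0:
        raise ValueError("nbytes must be positive")
    return (address // 4096) != ((address + nbytes - 1) // 4096)


def align_up(address, alignment):
    """Round address up to a multiple of alignment (must be a power of two)."""
    return (address + alignment - 1) & ~(alignment - 1)


BURST_BYTES = 128  # hypothetical burst length, for illustration only

# A 16-byte-aligned buffer can start just below a page boundary...
unaligned_start = 0x0FF0
print(crosses_4k_boundary(unaligned_start, BURST_BYTES))

# ...while a 128-byte-aligned start keeps a 128-byte burst inside one page.
aligned_start = align_up(unaligned_start, 128)
print(hex(aligned_start), crosses_4k_boundary(aligned_start, BURST_BYTES))
```

Under that assumption, any burst no longer than 128 bytes that starts on a 128-byte boundary stays within a single 4KB page, because 4096 is a multiple of 128.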

--
David Banks
dbanks@geontech.com
Geon Technologies, LLC
