I am currently benchmarking the zcu104 platform and have set up the zcu104 following the Installation Guide:
- Set up the ZynqReleases / git directory
- Built / installed / deployed the zcu104 and xilinx19_2_aarch64 platforms
I have been able to successfully run pattern_capture on the zcu104 in standalone, network, and server mode.
However, I am running into issues when I run testbias on the zcu104. Running testbias on the zcu104 (in standalone or network mode) produces a test.output file with the correct number of bytes, but every byte is zero, as observed using hexdump. Also, when benchmarking the zedboard, I was able to produce identical md5sums for test.input and test.output when applying a bias value of 0; this is not the case for the zcu104.
I have also noticed that I get the following error when I run testbias on the zcu104 in server mode. This error does not occur in network or standalone mode:
Property 40: file_write.countData = "false"
Property 41: file_write.bytesPerSecond = "0x0"
OCPI( 2:793.0813): Exception during application shutdown: error reading from container server "": EOF on socket read
Exiting for exception: error reading from container server "": EOF on socket read
Both the md5sum and hexdump verification can be seen in the attached text file as well as the standard out of testbias. Any help in this endeavor would be appreciated.
Thanks,
Joel Palmer
This was tested on version v2.1.0
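The md5sum and hexdump checks described in this post can also be scripted. The following is a minimal Python sketch and is not part of OpenCPI: the helper names md5_of, is_all_zero, and check_bias_zero_run are hypothetical, and the default file names test.input and test.output are the ones mentioned in the post. With a bias value of 0 the two digests would be expected to match, and an all-zero output file corresponds to the failure reported here.

```python
import hashlib


def md5_of(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_all_zero(path, chunk_size=65536):
    """Return True if every byte in the file is 0x00 (also True for an empty file)."""
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            if any(chunk):  # any non-zero byte makes this truthy
                return False
    return True


def check_bias_zero_run(input_path="test.input", output_path="test.output"):
    """Summarize a biasvalue=0 run, where output should be byte-identical to input."""
    return {
        "md5_match": md5_of(input_path) == md5_of(output_path),
        "output_all_zero": is_all_zero(output_path),
    }
```

Calling check_bias_zero_run() in the directory that holds both files returns a small dictionary that can be pasted into a report alongside the raw hexdump.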
One small point:
In server mode, the error:
error reading from container server "": EOF on socket read
means the server failed, and the server log should be examined to find out why.
Look for the server log in the sandbox directory of the zcu104.
After performing the steps again on the zcu104, the following log file was generated in the sandbox directory:
Discovery options: discoverable: 0, loopback: 0, onlyloopback: 0
Container server at <ANY>:12345
Available TCP server addresses are:
On interface eth0: 192.168.0.10:12345
Artifacts stored/cached in the directory "artifacts", which will be retained on exit.
Containers offered to clients are:
0: PL:0, model: hdl, os: , osVersion: , platform: zcu104
1: rcc0, model: rcc, os: linux, osVersion: 19_2, platform: xilinx19_2_aarch64
New client is "192.168.0.100:51782".
Shutting down client "192.168.0.100:51782" due to error: Code 0x17, level 0, error: 'Worker "file_read" produced an error during the "start" control operation: error opening file
It's also worth noting that the following was displayed once [$ ocpiremote start -b] was performed:
root@xilinx-zcu104-2019_2:~# [ 72.971915] opencpi: dma_set_coherent_mask failed for device ffffffc027f77400
[ 72.979120] opencpi: get_dma_memory failed in opencpi_init, trying fallback
[ 72.986108] NET: Registered protocol family 12
I have performed the exact same set of commands and flow on the Zedboard and have been able to successfully run testbias. Can you confirm that testbias has worked on the zcu104 during regression testing of version v2.1.0 in any of the three modes (standalone, network, server)?
Thanks,
Joel Palmer
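The failure signatures quoted in this message can be picked out of a longer log automatically. The following is a rough Python sketch, not an OpenCPI tool: the function name find_failure_lines and the pattern list are hypothetical, and the patterns are simply substrings copied from the log excerpts in this thread. The server log's file name is not given here, so the function takes the log text directly.

```python
import re

# Substrings copied from the log excerpts in this thread; extend as needed.
FAILURE_PATTERNS = [
    r"due to error:",
    r"dma_set_coherent_mask failed",
    r"get_dma_memory failed",
    r"EOF on socket read",
]


def find_failure_lines(log_text):
    """Return (line_number, line) pairs for lines matching a known failure pattern."""
    combined = re.compile("|".join(FAILURE_PATTERNS))
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if combined.search(line):
            hits.append((number, line.strip()))
    return hits
```

A match only flags a line for a closer look; it does not say which of the messages is the actual cause.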
The file_read worker is being run on the embedded device, which doesn't have access to the input file. If running in remote mode (ocpiremote), try forcing file_read and file_write to run on your host system:
ocpirun -v -d -x -m bias=hdl -p bias=biasvalue=0 -P file_read=ubuntu18_04 -P file_write=ubuntu18_04 testbias.xml
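For repeated runs, the command in this reply can be assembled programmatically. The following is a minimal Python sketch under stated assumptions: the function names build_ocpirun_args and run_testbias are hypothetical, only the options quoted in the reply are used, and ubuntu18_04 is the host platform name from the reply, which may differ on other development hosts.

```python
import subprocess


def build_ocpirun_args(host_platform="ubuntu18_04", bias_value=0, app="testbias.xml"):
    """Build the ocpirun argument list quoted in the reply.

    host_platform is assumed to be the RCC platform name of the
    development host; file_read and file_write are both pinned to it.
    """
    return [
        "ocpirun", "-v", "-d", "-x",
        "-m", "bias=hdl",
        "-p", f"bias=biasvalue={bias_value}",
        "-P", f"file_read={host_platform}",
        "-P", f"file_write={host_platform}",
        app,
    ]


def run_testbias(**kwargs):
    """Run ocpirun and raise CalledProcessError on a non-zero exit status."""
    return subprocess.run(build_ocpirun_args(**kwargs), check=True)
```

Keeping the argument list in one function makes it easy to vary only the bias value or the host platform between runs.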
Unfortunately, that did not seem to produce a different outcome. I'll reattach the standard out and md5sum / hexdump. In general, the issue still seems to be that testbias is not working on the zcu104 in any of the three modes of operation (standalone, network, server). Have you had any success running testbias on the zcu104 using v2.1.0?
Thanks,
Joel Palmer
Joel,
So, if I'm understanding correctly, it's now sending data, but the data written to file is not what you expected? We'll have someone take a look at it.
Jerry
Joel,
I was able to reproduce the same results on our end. We will investigate this issue and get back to you as soon as possible.
Aaron
While we have not gotten to the root cause of the zynq-ultrascale failures on v2.1.0, we have a workaround for the time being.
In 2.1, several performance improvements were introduced in the DMA driver to significantly improve throughput when streaming between ARM CPU and FPGA on zynq and other systems.
The DMA improvements were thought to apply to the zynq-ultrascale as well as zynq, or at least do no harm to zynq-ultrascale while improving zynq (7000).
Unfortunately, the release testing process was inadvertently and silently not testing zynq-ultrascale properly, and the DMA improvements broke zynq-ultrascale in 2.1.
Happily, the DMA improvements are controlled by an environment variable, OCPI_DMA_CACHE_MODE.
By setting OCPI_DMA_CACHE_MODE=0 the DMA caching improvements are disabled, and at least the zcu104 system appears to work as before.
Note that in "server mode" this can be accomplished by passing the "-e" option to ocpiremote when starting the server, i.e.:
ocpiremote start -b -e OCPI_DMA_CACHE_MODE=0
This is just a quick report based on our investigations so far, for those working with zynq-ultrascale with OpenCPI 2.1.
Jim
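The workaround in this message can be applied from a launcher script as well. The following is a minimal Python sketch, not an OpenCPI API: the function names env_with_dma_cache_disabled and ocpiremote_start_args are hypothetical, and the sketch assumes that the process being launched reads OCPI_DMA_CACHE_MODE from its own environment. The ocpiremote arguments are exactly those quoted in the message.

```python
import os


def env_with_dma_cache_disabled(base_env=None):
    """Return a copy of the environment with OCPI_DMA_CACHE_MODE set to "0"."""
    env = dict(os.environ if base_env is None else base_env)
    env["OCPI_DMA_CACHE_MODE"] = "0"
    return env


def ocpiremote_start_args():
    """Argument list for the server-mode command quoted in the message."""
    return ["ocpiremote", "start", "-b", "-e", "OCPI_DMA_CACHE_MODE=0"]
```

The returned dictionary can be passed as the env argument of subprocess.run when launching a run directly, while the argument list covers the server-mode case.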