Hello,
Is there a particular reason why the ocpi_buffer_size_* properties are restricted to 16 bits?
This results in quite a small limit on the amount of data that can be sent over a single buffer.
I am trying to send the output of an FFT (as floats), and this limits the size of the FFT I can do to 4096 (when sticking to powers of 2).
I would have expected that, even with the limit in place, I would be able to send 8192 floats in a single buffer, but when I set the sequenceLength property in the protocol to 8192 I get this error:
Exiting for exception: Value for property "ocpi_buffer_size_out" of instance "fft" of component "local.uhd.uhd.fft" is invalid for its type: for property ocpi_buffer_size_out: Expression value (6.5536e4) is out of range for UShort type properties (0 to 6.5535e4)
I know that I could serialise the data and send it over multiple buffers; however, this would introduce additional time and space complexity from copying memory in and out of extra buffers.
Thanks in advance,
Dan
Hi,
The following is what I believe to be true, but shouldn't be taken as
authoritative.
Firstly, this will depend on what protocol you are using.
You mention FFT (and it sounds like a non-complex one), so I'm going to
assume you are doing streams of float in and streams of float out.
A suitable protocol therefore might be float_timed_sample (specs/float_timed_sample-prot.xml, develop branch of OpenCPI Component Library Projects / SDR):
https://gitlab.com/opencpi/comp/ocpi.comp.sdr/-/blob/develop/specs/float_timed_sample-prot.xml
This protocol limits its sequences to a length of 4096.
4096 * 4 bytes per float = 16384 bytes total.
You will find that all the timed_sample protocols in ocpi.comp.sdr obey this same limit.
I had to go searching for why this is, although I knew this was a hard
limit.
It's related to the structure of the headers sent over the Scalable Data
Plane (SDP).
Section 5.4.10 in the Platform Development Guide talks about this:
https://opencpi.gitlab.io/releases/v2.5.0-beta.1/docs/OpenCPI_Platform_Development_Guide.pdf
Table 8 on page 64 shows a record element called count, which is the number of bytes transferred.
Page 67 then says "The maximum count allows for 16KB".
This limit can be found in the ocpi.core.sdp primitive library here (projects/core/hdl/primitives/sdp/sdp_pkg.vhd, develop branch):
https://gitlab.com/opencpi/opencpi/-/blob/develop/projects/core/hdl/primitives/sdp/sdp_pkg.vhd#L28
This limit was chosen so that the SDP could accommodate transmission of at
least a full Ethernet Jumbo Frame (9K).
All component implementations in OpenCPI are asked to obey the same rules
regarding their protocols.
As such, RCC is bound by the same limitations as HDL.
With regards to getting round this limitation, it sounds like you need to make use of the take method on your input port with send on your output.
take is defined in Section 4.4.9 of the RCC Development Guide:
https://opencpi.gitlab.io/releases/v2.5.0-beta.1/docs/OpenCPI_RCC_Development_Guide.pdf
Basically, the current buffer on the port is handed to the worker as a
pointer. The worker then owns this buffer.
This does not copy the data, and the documentation calls out that this is
how "sliding window algorithms" should be implemented.
send is defined in Section 4.4.6, which talks about how it should be used to "effect a zero copy transfer".
There are various combinations of take, send, release, advance, and request that could get to your desired outcome, although I'm not convinced that zero copy is achievable (unless there's some way to merge buffer data pointers).
The simplest solution (although not the most efficient) would be to take as many buffers as you need, do your FFT (however this is achieved), then copy the results back into the taken buffers and send them all.
This results in two copies (one into the FFT function, one out of it).
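The two copies described above can be sketched roughly as follows. This is only an illustration of the data movement: Buffer, gather and scatter are hypothetical names using plain std::vector, not OpenCPI types or calls, and the real worker would obtain and hand back its buffers through take and send as documented in the RCC Development Guide.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Stand-in for the payload of one port buffer; NOT an OpenCPI type.
using Buffer = std::vector<float>;

// Copy 1: gather the contents of several taken buffers into one
// contiguous array that an FFT routine could work on.
std::vector<float> gather(const std::vector<Buffer>& taken) {
    std::vector<float> all;
    for (const Buffer& b : taken) {
        all.insert(all.end(), b.begin(), b.end());
    }
    return all;
}

// Copy 2: scatter the results back into the same buffers, filling each
// buffer up to its existing length, ready to be sent one by one.
void scatter(const std::vector<float>& result, std::vector<Buffer>& taken) {
    std::size_t offset = 0;
    for (Buffer& b : taken) {
        if (offset >= result.size()) break;  // nothing left to copy
        const std::size_t n = std::min(b.size(), result.size() - offset);
        std::copy_n(result.begin() + static_cast<std::ptrdiff_t>(offset), n, b.begin());
        offset += n;
    }
}
```

Each sample is touched twice outside the FFT itself, once in gather and once in scatter, which is the cost referred to above.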
If you choose not to use take and send and go for a full advance approach, I think you still have two copies (one into the FFT function, one out of it).
I strongly advise reading the documentation of take and send, as it is very easy to trip up using them.
You will also need to look at RunConditions if you go down that route.
Kind Regards,
D. Walters
On Fri, Jun 16, 2023 at 10:01 AM dwp@md1tech.co.uk wrote:
discuss mailing list -- discuss@lists.opencpi.org
To unsubscribe send an email to discuss-leave@lists.opencpi.org
Thank you Dom, I will look into using the take method.
Hi Dom/Dan,
A few additional comments here.
The description of the SDP limitations is correct (big enough for jumbo
frames), but when the SDP sends message buffers, it "segments" the
buffers into potentially smaller SDP packets.
So the SDP header is used for segments, not the higher level messages.
The source code for SDP uses the term "messages" sometimes when it
should use "segments" or "packets", which is confusing.
The current HDL infrastructure code actually has a message size limit of
2^21-1 (the actual length-in-bytes fields are 21 bits).
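As a rough illustration of the segment/message distinction, the sketch below works out how many SDP segments a message of a given size would need. It assumes that the "16KB" maximum count quoted earlier in the thread means 16384 bytes per segment; the constant and function names are hypothetical and do not come from the OpenCPI sources.

```cpp
#include <cstddef>

// Assumed per-segment payload limit, taken from the "16KB" figure quoted above.
constexpr std::size_t kMaxSegmentBytes = 16 * 1024;

// Largest message length representable in a 21-bit length-in-bytes field.
constexpr std::size_t kMaxMessageBytes = (std::size_t{1} << 21) - 1;

// Number of segments needed to carry one message (ceiling division).
constexpr std::size_t segments_needed(std::size_t message_bytes) {
    return (message_bytes + kMaxSegmentBytes - 1) / kMaxSegmentBytes;
}
```

Under that assumption, a buffer at the ushort limit of 65535 bytes would travel as 4 segments, and a maximum-length message of 2097151 bytes as 128.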
But there is indeed a separate limitation of the ocpi_buffer_size
property's data type (ushort) for HDL workers.
8K floats should work since that implies a required buffer size of 32K,
which should fit in a ushort.
The HDL buffer size limit of 64K-1 is primarily due to tradeoffs in the
use of FPGA BRAM resources, which is another discussion.
I do not believe this UShort buffer size limitation applies to RCC workers.
Cheers,
Jim
On 6/16/23 8:02 AM, Dominic Walters via discuss wrote:
Hi Jim,
The output of my FFT is complex so the storage requirements are doubled.
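The arithmetic behind the error can be checked with a small sketch like the one below. The helper names are made up for illustration and are not part of OpenCPI; the only assumptions are 32-bit floats and that the buffer size is simply the number of samples times the bytes per sample.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

static_assert(sizeof(float) == 4, "sketch assumes 32-bit floats");

// Bytes needed for one message of `samples` values, each made of
// `floats_per_sample` floats (1 for real data, 2 for complex data).
constexpr std::size_t buffer_bytes(std::size_t samples, std::size_t floats_per_sample) {
    return samples * floats_per_sample * sizeof(float);
}

// True if the size can be stored in a UShort (16-bit unsigned) property.
constexpr bool fits_in_ushort(std::size_t bytes) {
    return bytes <= std::numeric_limits<std::uint16_t>::max();
}
```

With these, 8192 real floats need 32768 bytes, which fits, while 8192 complex samples need 65536 bytes, one more than the UShort maximum of 65535 reported in the error message.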
Also, all of the workers I have are RCC, and the limit on ocpi_buffer_size still applies.
I can submit this as a GitLab issue if you believe it is not the intended behaviour.
I have been playing around with using take and send to achieve this. Note that there seems to be an issue with the take method being hidden by a seemingly unrelated function with the same name. I did submit this as an issue, and the workaround is to use in.RCCUserPort::take so that the correct method is called.
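One way this kind of symptom can arise in C++ is name hiding, where a member declared in a derived class hides every base-class member of the same name. Whether that is what actually happens in the generated worker code is an assumption on my part; the classes below are hypothetical stand-ins, not the OpenCPI ones, and only show why a qualified call reaches the base method.

```cpp
#include <string>

// Hypothetical stand-in for a base port class; NOT the OpenCPI class.
struct BasePort {
    std::string take() { return "BasePort::take"; }
};

// Hypothetical derived port with an unrelated member of the same name.
// Declaring it hides BasePort::take() for unqualified lookup.
struct GeneratedPort : BasePort {
    std::string take(int) { return "GeneratedPort::take(int)"; }
};

std::string call_base_take(GeneratedPort& port) {
    // port.take() would not compile here, because name lookup stops at
    // GeneratedPort::take(int). Qualifying the name selects the base method.
    return port.BasePort::take();
}
```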
Thanks,
Dan