Device Resources

The following lists show important hardware details of all supported hardware platforms of this VisualApplets version. For a detailed list, please check the data sheet of the individual product.

Hardware Configuration of Supported Platforms

imaFlex

Resource | imaFlex CXP-12 Quad | imaFlex CXP-12 Penta | imaFlex 2 Dual 100
Vision Processor | Xilinx UltraScale+ XCKU3P-FFVD900-1-E | Xilinx UltraScale+ XCKU3P-FFVB676-1-E | Xilinx UltraScale+ XCKU15P-FFVE1517-1-E
LUT | 160679 | 161049 | 447992
Flip-Flop | 323224 | 323216 | 895984
Block RAM (18k) | 720 | 720 | 1776
URAM Blocks (288k) | 48 | 48 | 128
Embedded Arithmetic Logic Unit (DSP48) | 1368 | 1368 | 1800
RAM size | 3 x 512 MiB DDR4 | 5 x 512 MiB DDR4 | 2 @ 5 x 1024 MiB DDR4
RAM Data Width | 384 Bit | 640 Bit | 640 Bit
RAM Bandwidth total (shared) | 14.4 GB/s [1] | 24.0 GB/s [1] | 2 @ 24.0 GB/s [2]
Base Design Clock (default) | 312.5 MHz [3] | 312.5 MHz [3] | 391.5 MHz [4]
Base Design Clock (maximal) | 400.0 MHz | 400.0 MHz | 420.0 MHz
Host Interface | PCIe x 8 Gen 3 (Direct Memory Access) | PCIe x 8 Gen 3 (Direct Memory Access) | PCIe x 16 Gen 3 (Direct Memory Access)
Host Interface (PCIe x 8 Gen 3) Bandwidth (theor.) | 8000 MB/s | 8000 MB/s | 16000 MB/s
Host Interface (PCIe x 8 Gen 3) Bandwidth (typ./max.) | 7200 MB/s sustainable data bandwidth | 7200 MB/s sustainable data bandwidth | 13000 MB/s sustainable data bandwidth [5]

[1] The platform includes a single physical RAM bank. Although each RAM-based operator uses a dedicated memory segment, the overall RAM size and memory bandwidth are shared across all operators. See also section 'Shared Memory Concept'.

[2] The platform includes two physical RAM banks. Although each RAM-based operator uses a dedicated memory segment, the RAM size and memory bandwidth of a bank are shared across all operators (maximum 8) attached to that bank. See also section 'Shared Memory Concept'.

[3] This platform lets you specify the base clock. The minimum supported clock frequency is 312.5 MHz and the theoretical maximum is 400 MHz. Designs running at 312.5 MHz are generally expected to meet timing constraints. Frequencies above 312.5 MHz may lead to timing violations, depending on the implementation of the applet algorithms.

[4] This platform lets you specify the base clock. The minimum supported clock frequency is 391.5 MHz and the theoretical maximum is 420 MHz. Designs running at 391.5 MHz are generally expected to meet timing constraints. Frequencies above that value may lead to timing violations, depending on the implementation of the applet algorithms.

The minimum and maximum frequencies are not exactly 391.5 MHz and 420.0 MHz but values very close to them: the minimum is just above 391.5 MHz, and the maximum is just below 420.0 MHz. The values are rounded here for readability.

[5] The value depends on the PCIe bridge capabilities of the host PC. On most modern PCs, a maximum payload packet size of 256 bytes typically results in this value. Increasing the payload size improves bandwidth efficiency.

Table 68. Hardware Configuration imaFlex platforms


microEnable 5 marathon

Resource | mE5 marathon VCX-QP | mE5 marathon VCL | mE5 marathon VCLx
Vision Processor | Xilinx Kintex7 XC7K160T - 2FFG676C FPGA | Xilinx Kintex7 XC7K160T - 1FBG676C FPGA | Xilinx Kintex7 XC7K410T - 1FBG676C FPGA
LUT | 101400 | 101400 | 254200
Flip-Flop | 202800 | 202800 | 508400
Block RAM (18k) | 650 | 650 | 1590
Embedded Arithmetic Logic Unit (DSP48) | 600 | 600 | 1540
RAM size | 4 x 512 MiB DDR3 | 4 x 512 MiB DDR3 | 4 x 512 MiB DDR3
RAM Data Width | 512 Bit | 256 Bit | 256 Bit
RAM Bandwidth total (shared) | 12.8 GB/s [1] | 6.4 GB/s [1] | 6.4 GB/s [1]
Base Design Clock (default) | 125 MHz | 125 MHz | 125 MHz
Base Design Clock (maximal) | 312.5 MHz [2] | 312.5 MHz [2] | 312.5 MHz [2]
Host Interface | PCIe x 4 Gen 2 (Direct Memory Access) | PCIe x 4 Gen 2 (Direct Memory Access) | PCIe x 4 Gen 2 (Direct Memory Access)
Host Interface (PCIe x 4 Gen 2) Bandwidth (theor.) | 1x2000 MB/s | 1x2000 MB/s | 1x2000 MB/s
Host Interface (PCIe x 4 Gen 2) Bandwidth (typ./max.) | Up to 1800 MB/s sustainable data bandwidth | Up to 1800 MB/s sustainable data bandwidth | Up to 1800 MB/s sustainable data bandwidth

[1] These platforms have a single physical RAM bank, which is partitioned into 4 independent, non-overlapping memory regions. Although each RAM-based operator has exclusive use of its memory region, the RAM bandwidth is shared. See section 'Shared Memory Concept'.

[2] These platforms let you specify the base clock. The minimum clock frequency is 125 MHz and the theoretical maximum is 312.5 MHz. Designs with a clock frequency of 125 MHz are likely to meet the timing constraints. Designs with a clock frequency above 125 MHz may violate the timing constraints.

Table 69. Hardware Configuration microEnable 5 marathon


Device Resources of Supported Platforms

Device resources are limited on each hardware platform. The lists below show the available resources for all supported platforms. Operators consume device resources, and most resource instances can be used only once. There are some exceptions to this rule, e.g., the GPI operator. In these cases, resource consumption is documented in the operator reference documentation.

Device resources are allocated either

  • automatically,

  • using operator parameters, or

  • in the Resources dialog.

See 'Allocation of Device Resources' for more information.

imaFlex 2 Dual 100

Resource | imaFlex 2 Dual 100
Port[0] CoF Lane [1] | 4
Port[1] CoF Lane [1] | 4
Port[0] CoF RX Trigger Lane [1] | 4
Port[1] CoF RX Trigger Lane [1] | 4
Port[0] CoF TX Trigger Lane [1] | 4
Port[1] CoF TX Trigger Lane [1] | 4
Port[0] CoF Status Lane [1] | 4
Port[1] CoF Status Lane [1] | 4
Port[0] DF RX Data Lane [1] | 4
Port[1] DF RX Data Lane [1] | 4
Port[0] DF RX Meta Lane [1] | 4
Port[1] DF RX Meta Lane [1] | 4
Port[0] DF TX Data Lane [1] | 4
Port[1] DF TX Data Lane [1] | 4
Port[0] DF TX Meta Lane [1] | 4
Port[1] DF TX Meta Lane [1] | 4
Port[0] DF Green LED Lane [1] | 4
Port[0] DF Red LED Lane [1] | 4
Port[1] DF Green LED [1] | 1
Port[1] DF Red LED [1] | 1
DmaToHostPort [1] | 5
DmaFromHostPort [1] | 1
GPI [1] | 16
GPO [1] | 16
EventPort [2] | 31
EventID [2] | 64
User LED [1] | 6
ImageChannel [3] | 1024
RAM [4] | 2 @ from 1 x 5 GiB to 8 x 640 MiB

[1] These resources are read-only and are managed by the operators through their parameters.

[2] EventID refers to the maximum number of events supported by the software for the given platform. EventPort represents the event channel. Each channel can handle up to 16 events.

[3] The ImageChannel resource connects TxImageLink operators with RxImageLink operators. Each TxImageLink operator needs one ImageChannel resource exclusively. Each ImageChannel resource can be connected to exactly one RxImageLink operator, i.e., a maximum of 1024 TxImageLink and 1024 RxImageLink operators can be used in one design. The ImageChannel resource is visible in the Resources dialog.

[4] The RAM interface is divided into 2 physical RAM banks. Each RAM bank can be shared in bandwidth and size across up to 8 RAM-based operators. See section 'Shared Memory Concept' for more details.

Table 70. List of Device Resources imaFlex 2 Dual 100


imaFlex CXP-12 Quad and imaFlex CXP-12 Penta

Resource | imaFlex CXP-12 Quad | imaFlex CXP-12 Penta
Camera Port | 4 | 5
CxpStatusPort | 4 | 5
CxpRxTriggerPort | 4 | 5
CxpTxTriggerPort | 4 | 5
DMA | 4 | 5
DmaFromHostPort | 1 | 1
GPO [1] | 10 OUT | 12 OUT
GPI [2] | 12 IN | 12 IN
LED Ports [1] | 6 | 6
SignalChannel [3] | 4000 | 4000
EventPort [4] | 32 | 32
EventID [4] | 64 | 64
ImageChannel [5] | 1024 | 1024
RAM [6] | from 1 x 1.5 GiB to 8 x 192 MiB | from 1 x 2.5 GiB to 8 x 320 MiB

[1] These resources are not visible in the Resources dialog. They are controlled via operators.

[2] GPI is not visible in the Resources dialog. The same resource can be used multiple times. The table lists the number of GPI ports.

[3] The SignalChannel resource connects TxSignalLink operators with RxSignalLink operators. Each TxSignalLink operator needs one SignalChannel resource exclusively. Multiple RxSignalLink operators can use the same SignalChannel resource, i.e., multiple RxSignalLink operators can receive the signals transmitted by one TxSignalLink operator. A maximum of 4000 TxSignalLink operators can be used in a design. The number of RxSignalLink operators is not restricted. The SignalChannel resource is visible in the Resources dialog.

[4] EventID refers to the maximum number of events supported by the software for the given platform. EventPort represents the event channel. Each channel can handle up to 16 events.

[5] The ImageChannel resource connects TxImageLink operators with RxImageLink operators. Each TxImageLink operator needs one ImageChannel resource exclusively. Each ImageChannel resource can be connected to exactly one RxImageLink operator, i.e., a maximum of 1024 TxImageLink and 1024 RxImageLink operators can be used in one design. The ImageChannel resource is visible in the Resources dialog.

[6] The RAM interface is shared in bandwidth and size across all RAM-based operators. See section 'Shared Memory Concept' for more details.

Table 71. List of Device Resources imaFlex CXP-12 Quad and imaFlex CXP-12 Penta


microEnable 5 marathon

Resource | mE5 marathon VCX-QP | mE5 marathon VCL | mE5 marathon VCLx
Camera Port | 4 | 2 | 2
CameraControl | - | 2 | 2
DMA | 4 | 4 | 4
GPO [1] | 10 OUT | 10 OUT | 10 OUT
GPI [2] | 12 IN | 12 IN | 12 IN
RAM | 4 x 512 MiB | 4 x 512 MiB | 4 x 512 MiB
LED Ports [1] | 4 | 2 | 2
SignalChannel [3] | 4000 | 4000 | 4000
EventPort [4] | 14 | 10 | 10
EventID [4] | 64 | 64 | 64
ImageChannel [5] | 1024 | 1024 | 1024

[1] These resources are not visible in the Resources dialog. They are controlled via operators.

[2] GPI is not visible in the Resources dialog. The same resource can be used multiple times. The table lists the number of GPI ports.

[3] The SignalChannel resource connects TxSignalLink operators with RxSignalLink operators. Each TxSignalLink operator needs one SignalChannel resource exclusively. Multiple RxSignalLink operators can use the same SignalChannel resource, i.e., multiple RxSignalLink operators can receive the signals transmitted by one TxSignalLink operator. A maximum of 4000 TxSignalLink operators can be used in a design. The number of RxSignalLink operators is not restricted. The SignalChannel resource is visible in the Resources dialog.

[4] EventID refers to the maximum number of events supported by the software for the given platform. EventPort represents the event channel. Each channel can handle up to 16 events.

[5] The ImageChannel resource connects TxImageLink operators with RxImageLink operators. Each TxImageLink operator needs one ImageChannel resource exclusively. Each ImageChannel resource can be connected to exactly one RxImageLink operator, i.e., a maximum of 1024 TxImageLink and 1024 RxImageLink operators can be used in one design. The ImageChannel resource is visible in the Resources dialog.

Table 72. List of Device Resources microEnable 5 marathon


Shared Memory Concept

imaFlex 2 Dual 100

The imaFlex 2 Dual 100 platform contains two separate physical RAM banks, each with a capacity of 5 GiB. These physical banks are divided into non-overlapping memory regions as needed, based on the number of RAM-based VisualApplets operators used in an applet. Within VisualApplets, these regions appear as virtual RAM banks. When an operator requires RAM, it reserves a virtual RAM bank, which corresponds to a dedicated, non-overlapping region within the physical RAM.

On the imaFlex 2 Dual 100 platform, you can define up to 8 non-overlapping memory regions per physical RAM bank, which means a total of 16 regions across both RAM interfaces. If only one RAM operator is assigned to a physical interface, it has access to the entire 5 GiB capacity of that interface. When multiple operators share the same physical interface, the available memory is divided among them. For example, if 8 operators are assigned to the same interface, each operator receives 1/8 of the 5 GiB memory, that is, 0.625 GiB (640 MiB) per operator.

The RAM bandwidth on the imaFlex 2 Dual 100 is shared evenly among all operators connected to the same physical interface; it is not exclusive to any single operator.

  • If a design uses all 8 RAM resources on one interface, each RAM-based operator will receive approximately 1/8 of the total interface bandwidth, adjusted by the operator’s efficiency factor.

  • If only one RAM-based operator is used, it can utilize the full maximum bandwidth of that interface.

  • If two operators share the same interface, each will receive half of the total bandwidth, and so on.

The 2 physical RAM interfaces of the platform are mapped to 16 virtual RAM ports in VisualApplets. The first RAM interface is mapped to the RAM resource index 0 to 7, and the second is mapped to the RAM resource index 8 to 15.

RAM Size Distribution Across RAM Ports

RAM ports that are not allocated (and therefore not used) always receive 0% of the memory capacity. This means the total available memory is divided only among the operators that are actively using the RAM ports.

  • If only 1 port is used: That port receives 100% of the available memory.

  • If 2 ports are used: Each port receives 50% of the available memory.

  • If 3 ports are used: The port with the lowest resource ID receives 50%, and the other 2 ports each receive 25%.

  • If 4 ports are used: All 4 ports receive 25% each.

  • If 5 ports are used: The first 3 ports (with the lowest resource IDs) each receive 25%, and the remaining 2 ports each receive 12.5%.

  • If 6 ports are used: The 2 ports with the lowest resource IDs each receive 25%, and the other 4 ports each receive 12.5%.

  • If 7 ports are used: The port with the lowest resource ID receives 25%, and the other 6 ports each receive 12.5%.

  • If 8 ports are used: All ports receive 12.5% each.

[Note] Asymmetric Memory Allocation

Ports with the lowest RAM index within the same physical bank receive a larger allocation in case of asymmetric size partitioning. For example, if 3 RAM ports are used (0, 1, and 2), port 0 receives 50% of the RAM size, while ports 1 and 2 each receive 25%. The same applies to ports 8, 9, and 10: they map to the second RAM interface, but all three belong to the same physical RAM bank, so port 8 receives 50% and ports 9 and 10 each receive 25%.

The RAM indexes do not need to be contiguous; their absolute order determines the allocation. The RAM index is a virtual identifier and does not affect FPGA resource usage, even if there are gaps in the sequence. For example, if 3 RAM ports are used (1, 5, and 7): Port 1 receives 50% of the RAM size, and ports 5 and 7 each receive 25%.

There is no advantage or disadvantage in how RAM indexes are allocated when all resources map to the same physical bank. You can use the automatic allocation provided by VisualApplets or adjust it manually if a specific design operator requires more RAM than others.

[Note] Independent RAM Interfaces

Both physical RAM interfaces are completely independent. For example, if two operators are used and one is assigned to the first RAM interface while the other is assigned to the second, each operator can utilize the full maximum RAM size of its respective interface.
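
The partitioning rules above can be sketched in Python. This is an illustration only: the function name ram_shares and the constants are assumptions made for this example and are not part of VisualApplets or the Framegrabber SDK. The sketch relies on the observation that the eight listed cases match a split of each bank into the smallest power-of-two number of equal slots that fits all used ports, with the lowest-index ports receiving two slots each. This closed form is inferred from the list, not a documented firmware rule.

```python
from collections import defaultdict

BANK_SIZE_MIB = 5 * 1024   # 5 GiB per physical RAM bank (imaFlex 2 Dual 100)
PORTS_PER_BANK = 8         # RAM indexes 0-7 -> first bank, 8-15 -> second bank


def ram_shares(used_indexes):
    """Return {ram_index: size_in_MiB} for the used RAM resource indexes."""
    banks = defaultdict(list)
    for idx in sorted(set(used_indexes)):
        if not 0 <= idx < 2 * PORTS_PER_BANK:
            raise ValueError(f"RAM index out of range: {idx}")
        banks[idx // PORTS_PER_BANK].append(idx)

    sizes = {}
    for ports in banks.values():
        n = len(ports)
        slots = 1
        while slots < n:        # smallest power of two >= number of used ports
            slots *= 2
        doubled = slots - n     # this many lowest-index ports get two slots
        for rank, idx in enumerate(ports):
            sizes[idx] = BANK_SIZE_MIB * (2 if rank < doubled else 1) / slots
    return sizes
```

For example, ram_shares([1, 5, 7]) assigns 2560 MiB to port 1 and 1280 MiB each to ports 5 and 7, while ram_shares([0, 8]) assigns a full 5120 MiB bank to each port because the two ports sit on different interfaces.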

RAM Bandwidth Distribution Across RAM Ports

The shared memory controller uses a round-robin algorithm to distribute bandwidth evenly across all allocated ports within a physical RAM interface. This algorithm is based on a credit-based arbitration scheme. When a port becomes active, it remains active for a set number of credit clock cycles, as long as it continues to provide new RAM jobs. If an active port has no more jobs, it is deactivated, and the activation token moves to the next port with pending requests. This ensures that bandwidth is never wasted on idle ports.

The credit values are programmed by the firmware during applet synthesis, based on the number of RAM operators used. They can't be changed at runtime or through the Framegrabber SDK. Users have no access to credit configuration.

While a port is active, it has 100% of the memory controller bandwidth. If all 8 ports in a physical bank are active evenly, each port receives approximately 1/8 of the total bandwidth over time. In other cases, the actual average bandwidth depends on the load of each port.

The physical RAM interfaces are completely independent. Assigning RAM-based operators to different interfaces maximizes bandwidth efficiency. Bandwidth is shared only among operators that are connected to the same physical RAM interface.
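
The arbitration behavior described above can be illustrated with a small Python simulation. It is a simplified model under stated assumptions, not the firmware implementation: one RAM job is served per clock cycle, the credit value is a hypothetical parameter (the real credits are programmed by the firmware and are not accessible), and the function name simulate_round_robin is invented for this example.

```python
def simulate_round_robin(pending_jobs, credits, cycles):
    """Serve one RAM job per cycle and return the jobs served per port.

    pending_jobs: number of queued jobs for each port of one RAM interface.
    credits: maximum consecutive cycles a port may stay active (hypothetical).
    cycles: number of clock cycles to simulate.
    """
    pending = list(pending_jobs)
    served = [0] * len(pending)
    active, budget = 0, credits
    for _ in range(cycles):
        if not any(pending):
            break                      # no port has work left
        # Pass the token on when the active port is idle or out of credits.
        if pending[active] == 0 or budget == 0:
            active = (active + 1) % len(pending)
            while pending[active] == 0:
                active = (active + 1) % len(pending)
            budget = credits
        pending[active] -= 1
        served[active] += 1
        budget -= 1
    return served
```

In this model, eight equally loaded ports each receive 1/8 of the cycles, and a port that runs out of jobs gives up the token immediately, so no cycle is spent on an idle port.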

RAM Size and Bandwidth Optimization

The imaFlex 2 Dual 100 provides 2 independent physical RAM interfaces. Each interface has its exclusive bandwidth and memory size. When maximal bandwidth and size is targeted in a design, place RAM-based operators evenly between both RAM interfaces. Ports 0 to 7 are mapped to the first RAM interface, and ports 8 to 15 are mapped to the second.

imaFlex CXP-12 Quad and imaFlex CXP-12 Penta

The imaFlex CXP-12 Quad and imaFlex CXP-12 Penta platforms have a single physical RAM bank (the size of which is platform-specific). This bank is dynamically divided into non-overlapping regions, depending on the number of RAM-based VisualApplets operators used in the applet. Within VisualApplets, these regions appear as virtual RAM banks. When an operator reserves a RAM resource, it uses a virtual RAM bank, which maps to an exclusive, non-overlapping memory region within the physical RAM. Up to 8 non-overlapping regions can be defined on these platforms. If only 1 RAM-based operator is used, it gets the complete RAM size of the platform. Each additional operator reduces the size available per operator, because the memory is divided among all operators attached to the physical interface. If 8 operators are used, each operator is allocated 1/8 of the platform memory size.

The RAM bandwidth, however, is shared among all RAM-based operators in a design. When a design uses all 8 RAM resources, each of the 8 RAM-based operators can have up to 1/8 of the total bandwidth, reduced by the efficiency factor of that particular operator. When only one RAM-based operator is used in the design, this operator gets the total bandwidth of the platform. When 2 operators are used, each gets half the total bandwidth, and so on.

RAM Size Distribution Across RAM Ports

RAM ports that are not allocated (and therefore not used) always receive 0% of the memory size.

  • If 1 port is used: That port receives 100%.

  • If 2 ports are used: Each port receives 50%.

  • If 3 ports are used: The port with the lowest resource ID receives 50%, and the other 2 ports each receive 25%.

  • If 4 ports are used: All 4 ports receive 25% each.

  • If 5 ports are used: The 3 ports with the lowest resource IDs each receive 25%, and the remaining 2 ports each receive 12.5%.

  • If 6 ports are used: The 2 ports with the lowest resource IDs each receive 25%, and the other 4 ports each receive 12.5%.

  • If 7 ports are used: The port with the lowest resource ID receives 25%, and the other 6 ports each receive 12.5%.

  • If 8 ports are used: All ports receive 12.5% each.

[Note]

Ports with the lowest RAM index receive a larger allocation in case of asymmetric size partitioning. For example, if 3 RAM ports are used (0, 1, and 2), port 0 receives 50% of the RAM size, while ports 1 and 2 each receive 25%.

The RAM indexes do not need to be contiguous; their absolute order determines the allocation. The RAM index is a virtual identifier and does not affect FPGA resource usage, even if there are gaps in the sequence. For example, if 3 RAM ports are used (1, 5, and 7), port 1 receives 50% of the RAM size, and ports 5 and 7 each receive 25%.

There is no advantage or disadvantage in how RAM indexes are allocated to the operators. You can use the automatic allocation provided by VisualApplets or adjust it manually if a specific design operator requires more RAM than the others.

RAM Bandwidth Distribution Across RAM Ports

The shared memory controller uses a round-robin algorithm to distribute bandwidth evenly across all allocated ports. This algorithm is based on a credit-based arbitration scheme. When a port becomes active, it remains active for a set number of credit clock cycles, as long as it continues to provide new RAM jobs. If an active port has no more jobs, it is deactivated, and the activation token moves to the next port with pending requests. This ensures that bandwidth is never wasted on idle ports.

The credit values are programmed by the firmware during applet synthesis, based on the number of RAM operators used. They can't be changed at runtime or through the Framegrabber SDK. Users have no access to credit configuration.

While a port is active, it has 100% of the memory controller bandwidth. If all 8 ports are used and are active evenly, each port receives 1/8 of the total bandwidth over time. In all other cases, the actual average bandwidth depends on the load of each port.

microEnable 5 marathon

The microEnable 5 marathon platforms are assembled with only one physical RAM bank (the size of which is platform-specific). This single physical bank is formatted into 4 non-overlapping memory regions. These 4 regions are represented inside VisualApplets as 4 virtual RAM banks. When an operator reserves a RAM resource, it is using a virtual RAM bank which maps to an exclusive non-overlapping memory region inside the physical RAM.

The RAM bandwidth, however, is shared among all RAM-based operators in a design. When a design uses all 4 RAM resources, each of the 4 RAM-based operators can have up to 1.6 GB/s of bandwidth, reduced by the efficiency factor of that particular operator. When only one RAM-based operator is used in the design, this operator gets the total bandwidth of 6.4 GB/s. When 2 operators are used, each gets half the total bandwidth, and so on.

[Note] Bandwidth per Operator

The on-board RAM provides 6.4 GB/s total bandwidth. The bandwidth available for an individual RAM-based operator is the total bandwidth divided by the number of all instantiated RAM-based operators in the design.
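
As a quick check of the arithmetic in this note, the division can be written as a short Python sketch. The function name bandwidth_per_operator is an assumption made for this example, and the operator-specific efficiency factor mentioned above is deliberately not modeled.

```python
TOTAL_BANDWIDTH_GB_S = 6.4   # total on-board RAM bandwidth stated in the note


def bandwidth_per_operator(operators, total_gb_s=TOTAL_BANDWIDTH_GB_S):
    """Upper bound on the bandwidth of one RAM-based operator in GB/s."""
    if operators < 1:
        raise ValueError("at least one RAM-based operator is required")
    # Even split across all instantiated RAM-based operators.
    return total_gb_s / operators


for n in range(1, 5):
    print(f"{n} operator(s): {bandwidth_per_operator(n):.2f} GB/s each")
```

With 4 operators this yields the 1.6 GB/s per operator quoted above; the real figure is lower once the efficiency factor of the operator is taken into account.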

Figure 441. RAM architecture


This RAM architecture needs to be considered when designing with RAM based operators.

Because the bandwidth is shared, the applet developer should use all 256 bits of the operator’s memory interface (RAM Data Width) when a design contains multiple RAM-based operators. This maximizes throughput through the memory interface, even if an individual RAM operator needs less bandwidth at its input.

Data Forwarding with imaFlex 2 Dual 100 Frame Grabbers

The imaFlex 2 Dual 100 offers 2 fiber QSFP28 connectors: C0 and C1.

Figure 442. imaFlex 2 Dual 100 C1 And C0 Connectors


Both connectors support CoF (CXP over Fiber) and data forwarding protocols. Each fiber connector has its own LEDs: Four LEDs are assigned to connector C0, and one LED is assigned to connector C1.

Each connector supports QSFP28 optical modules. A QSFP28 interface provides 4 fiber connections in each direction, i.e., 4 RX and 4 TX connections, labeled 0 to 3. The operator documentation refers to a fiber connection as a lane. Each lane operates at 25 Gbit/s, so a single QSFP28 connector can transport an accumulated bandwidth of 100 Gbit/s in each of the TX and RX directions.

Figure 443. QSFP28 Interface (illustrative)


imaFlex 2 Dual 100 offers data forwarding capabilities to transmit data bidirectionally across multiple frame grabbers. Data forwarding in VisualApplets supports various topologies to transport data and metadata, including daisy-chain and more advanced models. Data forwarding behavior is fully controlled by the user through VisualApplets operators. Camera data, applet‑generated data, and metadata can be transmitted independently in both TX and RX directions. VisualApplets also multiplexes lanes, allowing metadata and regular image data to be sent simultaneously over a single fiber lane using two separate virtual channels. Metadata is prioritized to ensure timely delivery.

Because fiber technology supports bidirectional communication, an advanced master–slave application can let the master control which data the slave devices receive and collect processing results from them. Processed results generated on a slave GPU or CPU can be transferred back to the frame grabber via DmaFromPC and then forwarded through data forwarding operators to the master frame grabber.

In a typical data forwarding daisy-chain configuration, selected camera data is forwarded to multiple frame grabbers. These frame grabbers then transmit the data to an external GPU over PCIe with very low latency.

Figure 444. Data Forwarding


Figure 445. Data Forwarding Port Connection


CoF (CXP over Fiber) uses four downstream connections and one upstream connection. Data forwarding links between two frame grabbers are configured to transmit only the required data; they do not transport native CoF traffic but instead carry VisualApplets data and metadata streams. Both CoF and data forwarding are protected by forward error correction (FEC).

The master frame grabber connects to the camera and forwards the selected data parts to the next slave frame grabber. Each slave frame grabber extracts the data it needs for its own processing and forwards the remaining, unprocessed parts of the image stream to the next slave in the sequence. The last frame grabber in the chain acts as an endpoint and does not forward any data further.

The example above is a simplified representation of a daisy-chain topology. In more advanced configurations, two connected frame grabbers can exchange data freely: they can span 1-, 2-, 3-, or 4-lane connections between each other. RX and TX directions may use different numbers of lanes.

Multiple optical lanes can be combined into virtual channels, allowing several data channels to be transmitted over a single QSFP28 connector. In addition to image data transmission, VisualApplets provides a dedicated meta channel for each optical lane. This meta channel has priority over normal image data and is transmitted with very low jitter and latency, enabling applications to implement time stamping and board-to-board synchronization. Small side-band information can also be transmitted via the meta channel, for example to configure the slave partner on the opposite end of the fiber connection. The partner can return control and status information over its own meta channel back to the master. Each frame grabber can also send processed image data to its partner at an aggregated bandwidth of up to 100 Gb/s.

Up to four meta channels can be spanned between two frame grabbers over a single QSFP28 interface, independently in TX and RX directions. Each meta channel operates at 25 Gb/s. Image-data channels can span 1, 2, 3, or 4 optical lanes to form a virtual data-transmission channel.

A QSFP28 data forwarding interface allows the following combinations of virtual data channels:

  • 1 x4 channel running at 100 Gbit/s

  • 1 x3 channel running at 75 Gbit/s and 1 x1 channel running at 25 Gbit/s

  • 2 x2 channels running at 50 Gbit/s each

  • 1 x2 channel running at 50 Gbit/s and 2 x1 channels running at 25 Gbit/s each

  • 4 x1 channels running at 25 Gbit/s each
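
The five combinations listed above are exactly the ways to split the 4 lanes of a QSFP28 interface into groups. The following Python sketch enumerates them, assuming 25 Gbit/s per lane as stated earlier. The function name channel_combinations is hypothetical and does not correspond to any VisualApplets API.

```python
LANES = 4          # fiber lanes per QSFP28 interface and direction
LANE_GBIT_S = 25   # data rate of a single lane


def channel_combinations(lanes=LANES, largest=None):
    """Yield every split of `lanes` into channel widths, widest first."""
    if largest is None:
        largest = lanes
    if lanes == 0:
        yield []
        return
    for width in range(min(lanes, largest), 0, -1):
        # Later channels may not be wider than the current one,
        # so each combination is produced only once.
        for rest in channel_combinations(lanes - width, width):
            yield [width] + rest


for combo in channel_combinations():
    print(" + ".join(f"x{w} ({w * LANE_GBIT_S} Gbit/s)" for w in combo))
```

Each printed line corresponds to one bullet of the list, from a single x4 channel down to four x1 channels.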

The RX and TX sides operate completely independently. This means that the backward (RX) direction can be implemented or omitted as needed. If it is implemented, it does not have to be symmetrical to the forward (TX) direction.

Meta channels and data channels operate independently. Each meta channel uses exactly one fiber lane and runs at 25 Gb/s. Meta channels are designed for transmitting small amounts of data. Because they have priority, a meta‑channel transmission temporarily pauses regular data transmission on the same fiber lane.

Data Forwarding Operators

To support data forwarding functionality, VisualApplets provides the following operators:

  • DFTxData – operator for transmitting image data, spanning 1–4 lanes.

  • DFRxData – operator for receiving image data, spanning 1–4 lanes.

  • DFTxMeta – operator for transmitting metadata, 1 lane.

  • DFRxMeta – operator for receiving metadata, 1 lane.

  • DFLed – operator for accessing the front-slot QSFP28 LEDs (bicolor: red, green, orange). C1 has 1 LED; C0 has 4 bicolor LEDs (one per lane).

Basler provides data forwarding examples as VisualApplets designs, along with a C++ SDK control application. These examples show how to use the operators for data forwarding. They are available as VisualApplets examples at Examples/Acquisition/DataForwarding. You can find documentation for these examples in the 'Data Forwarding for CoaXPress over Fiber Frame Grabber' topic in the tutorial.

Figure 446. VA Design Example for a Master in a Data Forwarding Topology


Figure 447. VA Design Example for a Slave Master in a Data Forwarding Topology


Figure 448. VA Design Example for an Endpoint Slave Master in a Data Forwarding Topology


Optical Topologies

Under certain restrictions, more complex camera–frame grabber models are possible. The optical transmission between two frame grabbers has extremely low latency: approximately 120 ns for a data word to travel from one frame grabber to another.

Daisy Chain

A daisy‑chain topology is the most common approach for distributing processing loads across multiple CPU/GPU clusters:

Figure 449. Daisy Chain (illustrative)


The master frame grabber connects to the camera; the second QSFP28 port connects to the next slave. Each slave processes part of the incoming data and forwards the remaining data to the next device. The last frame grabber in the chain serves as the endpoint. All frame grabbers except the last use both QSFP28 ports.

Dual Camera

Figure 450. Dual Camera


For some applications, it is possible to connect two 100‑Gb/s cameras to a single frame grabber. This requires that the cameras are not operating at their maximum bandwidth, or that pre‑processing in VisualApplets can be performed before storing the data in RAM to slightly reduce the bandwidth so both camera streams fit into the two RAM interfaces. In this configuration, the frame grabber must perform additional processing in VisualApplets to reduce the resulting data rate so it matches the limited PCIe throughput of approximately 13 GB/s.
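
The required data reduction can be estimated with a rough calculation, shown here as a Python sketch. It assumes that both cameras deliver the full 100 Gbit/s line rate as payload (protocol overhead is ignored) and uses the approximate PCIe throughput of 13 GB/s given above. The names used are illustrative and not part of any SDK.

```python
CAMERA_GBIT_S = 100   # line rate of one camera link
PCIE_GB_S = 13.0      # approximate sustained PCIe throughput to the host


def raw_camera_gb_s(cameras=2):
    """Combined raw camera data rate in GB/s (8 bits per byte)."""
    return cameras * CAMERA_GBIT_S / 8


def max_host_fraction(cameras=2):
    """Largest fraction of the raw camera data that fits through PCIe."""
    return min(1.0, PCIE_GB_S / raw_camera_gb_s(cameras))


print(f"raw input: {raw_camera_gb_s():.1f} GB/s")
print(f"max. fraction to host: {max_host_fraction():.0%}")
```

Under these assumptions, two cameras produce 25 GB/s, so the applet has to reduce the data to roughly half before it is transferred to the host.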

Dual Master

Figure 451. Dual Master


In this case, each frame grabber connects to its own camera. The second QSFP28 port is used for inter‑frame‑grabber communication, enabling data exchange at up to 100 Gbit/s in both directions for complex computing tasks and synchronization.

Triple Master

Figure 452. Triple Master


In this topology, each frame grabber connects to its own dedicated camera. The central frame grabber uses its remaining QSFP28 port to connect to two other master frame grabbers via a special fiber cable that splits four lanes into two QSFP28 modules with two lanes each. The only direct connections for the two outer frame grabbers are the full four‑lane links to their respective cameras and the two‑lane links to the central frame grabber. These two lanes enable communication between the frame grabbers at speeds of up to 50 Gbit/s.