Lines Matching refs:the

23   -- Host never reads from the FPGA
24 -- Channels, pipes, and the message channel
41 or even a processor with its peripherals. FPGAs are the LEGO of hardware:
42 Based upon certain building blocks, you make your own toys the way you like
44 available on the market as a chipset, so FPGAs are mostly used when some
45 special functionality is needed, and the production volume is relatively low
46 (hence not justifying the development of an ASIC).
50 focus on their specific project, and not reinvent the wheel over and over
51 again, pre-designed building blocks, IP cores, are often used. These are the
55 building block, with electrical wires dangling on the sides for connection to
58 One of the daunting tasks in FPGA design is communicating with a full-blown
59 operating system (actually, with the processor running it): Implementing the
60 low-level bus protocol and the somewhat higher-level interface with the host
61 (registers, interrupts, DMA etc.) is a project in itself. When the FPGA's
63 make sense to design the FPGA's interface logic specifically for the project.
64 A special driver is then written to present the FPGA as a well-known interface
65 to the kernel and/or user space. In that case, there is no reason to treat the
66 FPGA differently than any device on the bus.
68 It's however common that the desired data communication doesn't fit any well-
69 known peripheral function. Also, the effort of designing an elegant
70 abstraction for the data exchange is often considered too big. In those cases,
72 effectively written as a user space program, leaving the kernel space part
74 interface logic for the FPGA, and write a simple ad-hoc driver for the kernel.
80 elementary data transport between an FPGA and the host, providing pipe-like
83 have the project-specific part of the driver running in a user-space program.
85 Since the communication requirements may vary significantly from one FPGA
86 project to another (the number of data pipes needed in each direction and
87 their attributes), there isn't one specific chunk of logic being the Xillybus
88 IP core. Rather, the IP core is configured and built based upon a
92 communication to the user. At the host side, a character device file is used
93 just like any pipe file. On the FPGA side, hardware FIFOs are used to stream
94 the data. This is contrary to a common method of communicating through fixed-
95 sized buffers (even though such buffers are used by Xillybus under the hood).
97 also no more than one, depending on the configuration.
99 In order to ease the deployment of the Xillybus IP core, it contains a simple
100 data structure which completely defines the core's configuration. The Linux
102 up the DMA buffers and character devices accordingly. As a result, a single
103 driver is used to work out of the box with any Xillybus IP core.
106 configuration space or the Flattened Device Tree.
114 On the host, all interface with Xillybus is done through /dev/xillybus_*
115 device files, which are generated automatically as the driver loads. The
116 names of these files depend on the IP core that is loaded in the FPGA (see
117 Probing below). To communicate with the FPGA, open the device file that
118 corresponds to the hardware FIFO you want to send data to or receive data from,
126 possibly pressing CTRL-C at some stage, even though the xillybus_* pipes have
127 the capability to send an EOF (but may not use it).
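
A minimal user-space sketch of treating a device file like an ordinary pipe
is given here for illustration. It assumes nothing beyond the standard POSIX
open()/read()/close() calls; the helper name read_stream() is hypothetical,
and any path under /dev/xillybus_* that is passed to it depends on the IP
core actually loaded in the FPGA. The loop handles short reads, which are
legitimate for any stream-like file.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to `len` bytes from the file at `path`.
 * Returns the number of bytes read (less than `len` only if EOF was
 * reached), or -1 on error.
 */
ssize_t read_stream(const char *path, void *buf, size_t len)
{
	size_t done = 0;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;

	while (done < len) {
		ssize_t rc = read(fd, (char *)buf + done, len - done);

		if (rc < 0) {
			if (errno == EINTR)
				continue;	/* interrupted by a signal, retry */
			close(fd);
			return -1;
		}
		if (rc == 0)
			break;		/* EOF */
		done += (size_t)rc;
	}

	close(fd);
	return (ssize_t)done;
}
```

A caller would pass the path of the device file that corresponds to the
desired FPGA to host FIFO, along with a buffer and the number of bytes wanted.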
140 "channel" structure in the implementation code).
145 Xillybus pipes are configured (on the IP core) to be either synchronous or
147 some data has been submitted and acknowledged by the FPGA. This slows down
149 require data at a constant rate: There is no data transmitted to the FPGA
150 between write() calls, in particular when the process loses the CPU.
153 room in the buffers to store any of the data in the buffers.
155 For FPGA to host pipes, asynchronous pipes allow data transfer from the FPGA
156 as soon as the respective device file is opened, regardless of whether the data
157 has been requested by a read() call. On synchronous pipes, only the amount
160 In summary, for synchronous pipes, data between the host and FPGA is
161 transmitted only to satisfy the read() or write() call currently handled
162 by the driver, and those calls wait for the transmission to complete before
165 Note that the synchronization attribute has nothing to do with the possibility
173 A synchronous pipe can be configured to have the stream's position exposed
174 to the user logic at the FPGA. Such a pipe is also seekable on the host API.
175 With this feature, a memory or register interface can be attached on the
176 FPGA side to the seekable stream. Reading or writing to a certain address in
177 the attached memory is done by seeking to the desired address, and calling
188 that depend on the specific bus interface (xillybus_of.c and xillybus_pcie.c).
191 the kernel. Since the DMA mapping and synchronization functions, which are bus
192 dependent by their nature, are used by the core module, a
193 xilly_endpoint_hardware structure is passed to the core module on
195 which execute the DMA-related operations on the bus.
200 Each pipe has a number of attributes which are set when the FPGA component
201 (IP core) is built. They are fetched from the IDT (the data structure which
202 defines the core's configuration, see Probing below) by xilly_setupchannels()
206 host pipe (the FPGA "writes").
208 * channelnum: The pipe's identification number in communication between the
214 applies) may return with less than the requested number of bytes. The common
217 * synchronous: A non-zero value means that the pipe is synchronous. See
224 * exclusive_open: A non-zero value forces exclusive opening of the associated
225 device file. If the device file is bidirectional, and already opened only in
226 one direction, the opposite direction may be opened once.
228 * seekable: A non-zero value indicates that the pipe is seekable. See
231 * supports_nonempty: A non-zero value (which is typical) indicates that the
232 hardware will send the messages that are necessary to support select() and
235 Host never reads from the FPGA
239 doesn't expect a card to go away all of a sudden. But since the PCIe card
240 is based upon reprogrammable logic, a sudden disappearance from the bus is
241 quite likely as a result of an accidental reprogramming of the FPGA while the
243 if the host attempts to read from an address that is mapped to the PCI Express
244 device, that leads to an immediate freeze of the system on some motherboards,
245 even though the PCIe standard requires a graceful recovery.
247 In order to avoid these freezes, the Xillybus driver refrains completely from
248 reading from the device's register space. All communication from the FPGA to
249 the host is done through DMA. In particular, the Interrupt Service Routine
250 doesn't follow the common practice of checking a status register when it's
251 invoked. Rather, the FPGA prepares a small buffer which contains short
252 messages, which inform the host what the interrupt was about.
254 This mechanism is used on non-PCIe buses as well for the sake of uniformity.
257 Channels, pipes, and the message channel
260 Each of the (possibly bidirectional) pipes presented to the user is allocated
261 a data channel between the FPGA and the host. The distinction between channels
263 related messages from the FPGA, and has no pipe attached to it.
268 Even though a non-segmented data stream is presented to the user at both
269 sides, the implementation relies on a set of DMA buffers which is allocated
270 for each channel. For the sake of illustration, let's take the FPGA to host
271 direction: As data streams into the respective channel's interface in the
272 FPGA, the Xillybus IP core writes it to one of the DMA buffers. When the
273 buffer is full, the FPGA informs the host about that (appending a
275 necessary). The host responds by making the data available for reading through
276 the character device. When all data has been read, the host writes to
277 the FPGA's buffer control register, allowing the buffer's overwriting. Flow
280 This is not good enough for creating a TCP/IP-like stream: If the data flow
281 stops momentarily before a DMA buffer is filled, the intuitive expectation is
282 that the partial data in the buffer will arrive anyhow, despite the buffer not
283 being completed. This is implemented by adding a field in the
284 XILLYMSG_OPCODE_RELEASEBUF message, through which the FPGA informs not just
287 But the FPGA will submit a partially filled buffer only if directed to do so
288 by the host. This situation occurs when the read() method has been blocking
289 for XILLY_RX_TIMEOUT jiffies (currently 10 ms), after which the host commands
290 the FPGA to submit a DMA buffer as soon as it can. This timeout mechanism
294 A similar setting is used in the host to FPGA direction. The handling of
295 partial DMA buffers is somewhat different, though. The user can tell the
296 driver to submit all data it has in the buffers to the FPGA, by issuing a
297 write() with the byte count set to zero. This is similar to a flush request,
299 an equivalent flush roughly XILLY_RX_TIMEOUT jiffies after the last write().
300 This allows the user to be oblivious to the underlying buffering mechanism
303 Note that the issue of partial buffer flushing is irrelevant for pipes having
304 the "synchronous" attribute nonzero, since synchronous pipes don't allow data
305 to lie around in the DMA buffers between read() and write() anyhow.
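
The explicit flush request for the host to FPGA direction (a write() with the
byte count set to zero) can be sketched from user space as follows. This is
only an illustration under the assumption that `fd` is an already opened,
writable file descriptor; the helper name write_and_flush() is hypothetical.
On an asynchronous Xillybus pipe, the final zero-length write() is what asks
the driver to submit whatever is left in its buffers.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: write all `len` bytes to `fd`, then issue a
 * zero-length write() as a flush request. Returns 0 on success, -1 on error.
 */
int write_and_flush(int fd, const void *buf, size_t len)
{
	size_t done = 0;

	while (done < len) {
		ssize_t rc = write(fd, (const char *)buf + done, len - done);

		if (rc < 0) {
			if (errno == EINTR)
				continue;	/* interrupted by a signal, retry */
			return -1;
		}
		done += (size_t)rc;	/* partial writes are legitimate */
	}

	/* Byte count of zero: no data is transferred, only the flush request */
	if (write(fd, buf, 0) < 0)
		return -1;

	return 0;
}
```

Without the explicit call, the text above states that the driver issues an
equivalent flush on its own a short while after the last write().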
310 The data arrives or is sent at the FPGA as 8, 16 or 32 bit wide words, as
311 configured by the "format" attribute. Whenever possible, the driver attempts
312 to hide this when the pipe is accessed differently from its natural alignment.
315 will also work, but the driver can't send partially completed words to the
316 FPGA, so the transmission of up to one word may be held until it's fully
319 This somewhat complicates the handling of host to FPGA streams, because
321 the FPGA, and hence can't be sent. To prevent loss of data, these leftover
322 bytes need to be moved to the next buffer. The parts in xillybus_core.c
328 As mentioned earlier, the number of pipes that are created when the driver
329 loads and their attributes depend on the Xillybus IP core in the FPGA. During
330 the driver's initialization, a blob containing configuration info, the
331 Interface Description Table (IDT), is sent from the FPGA to the host. The
334 1. Acquire the length of the IDT, so a buffer can be allocated for it. This
335 is done by sending a quiesce command to the device, since the acknowledge
336 for this command contains the IDT's buffer length.
338 2. Acquire the IDT itself.
340 3. Create the interfaces according to the IDT.
345 In order to simplify the logic that prevents illegal boundary crossings of
346 PCIe packets, the following rule applies: If a buffer is smaller than 4kB,
349 pages from the kernel, and dividing them into DMA buffers as necessary. Since
353 All buffers are allocated when the driver is loaded. This is necessary,
355 which are more likely to be available when the system is freshly booted.
357 The allocation of buffer memory takes place in the same order they appear in
358 the IDT. The driver relies on a rule that the pipes are sorted with decreasing
359 buffer size in the IDT. If a requested buffer is larger than or equal to a page,
360 the necessary number of pages is requested from the kernel, and these are
361 used for this buffer. If the requested buffer is smaller than a page, one
362 single page is requested from the kernel, and that page is partially used.
363 Or, if there already is a partially used page at hand, the buffer is packed
364 into that page. It can be shown that all pages requested from the kernel
365 (except possibly for the last) are 100% utilized this way.
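
The packing rule described above can be illustrated with a small stand-alone
model. It is a sketch only, not the driver's code: the names pack_buffers(),
struct pack_stats and PAGE_BYTES are hypothetical, the page size is assumed
to be 4096 bytes, and buffer sizes are assumed to be powers of two given in
decreasing order. The model merely counts pages and unused bytes; it does
not allocate any memory.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_BYTES 4096u	/* assumed page size */

struct pack_stats {
	size_t pages;	/* total pages requested from the allocator */
	size_t wasted;	/* bytes left unused in partially filled pages */
};

/* Model the allocation order: `sizes` must be sorted in decreasing order. */
struct pack_stats pack_buffers(const size_t *sizes, size_t n)
{
	struct pack_stats st = { 0, 0 };
	size_t left = 0;	/* free bytes in the partially used page */
	size_t i;

	for (i = 0; i < n; i++) {
		size_t sz = sizes[i];

		assert(i == 0 || sz <= sizes[i - 1]);	/* decreasing order */

		if (sz >= PAGE_BYTES) {
			/* Whole pages are dedicated to this buffer */
			st.pages += (sz + PAGE_BYTES - 1) / PAGE_BYTES;
		} else {
			if (left < sz) {
				/* No room in the current page: take a new one */
				st.wasted += left;
				st.pages++;
				left = PAGE_BYTES;
			}
			left -= sz;	/* pack the buffer into the page */
		}
	}

	st.wasted += left;	/* remainder of the last partial page */
	return st;
}
```

For example, with the sizes 8192, 4096, 2048, 1024, 1024 and 512, this model
requests five pages, and the only unused space (3584 bytes) is in the last
one.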
370 In order to support the "poll" method (and hence select()), there is a small
371 catch regarding the FPGA to host direction: The FPGA may have filled a DMA
372 buffer with some data, but not submitted that buffer. If the host waited for
373 the buffer's submission by the FPGA, there would be a possibility that the
374 FPGA side has sent data, but a select() call would still block, because the
376 XILLYMSG_OPCODE_NONEMPTY messages sent by the FPGA when a channel goes from