Tuesday, August 23, 2016

Prj 146 - Device Interface Software (Part 3)

The software to control and interface with the dual channel ADC board is fundamentally an SPI interface.  There are a couple of aspects which make this interesting.

First, the primary SPI software is a function of the F board and was developed for multiple projects. This software provides both a CPU (C++) SPI interface and a PRU SPI interface.  The former toggles the SPI lines directly via BeagleBone Black GPIOs, while the latter sends commands to PRU0, which toggles the IO lines to the FPGA.  The advantage is that the latter can achieve SPI clock rates in the 9 to 16 MHz range, while the former is limited to the kilohertz range.

The second interesting aspect is that the basic digital control block of the VHDL image not only provides registers to control the signal block and quadrature down conversion but also two ports of 6 pin gpio.  These ports can be used simultaneously by other boards and their controlling software.  The software needs to provide an independent mechanism to access these ports while not significantly impacting streaming of IQ ADC streams.

The final aspect of interest is that the down conversion software on the BeagleBone Black provides a standard IQ ADC interface to other software applications.  In the cases where the 40MSPS and 4MSPS channels are selected, the CPU cannot process the full set of streaming data.  In the case of the 200kHz channel (2 bytes/sample or 400kB/s), the processor can stream the channel to a network SDR application.

To address the above considerations, the software was structured along the lines of the following diagram.
Software Organization
Fboard
Starting at the bottom, PRU0 controls the SPI pins to the FPGA.  Based on simple throughput measurements, the PRU can drive a SPI clock of ~9MHz at 1x SPI width.  This results in ~515 kwords/second measured (with a word being 16 bits).  The 2x SPI would give slightly less than 2x performance (due to the additional instructions for manipulating two bits at a time).  PRU0 has two command mailboxes in its SRAM which it monitors and services round robin.  Each command contains an operation, a number of words/bytes, and the words/bytes to transfer.  As the words are shifted onto the SPI interface, the values shifted in are placed in the SRAM mailbox.  One mailbox is used by the CPU to request SPI transfers while the other can be used by PRU1.
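As a rough illustration of the round robin mailbox servicing, here is a minimal C++ sketch. The real code runs on PRU0 and its SRAM layout is not shown in this post, so the `Mailbox` fields, the 256 word capacity, `ShiftWord`, and `ServiceOnce` are all hypothetical names chosen for the example. The SPI hardware is replaced by a stand-in that simply returns the inverted word.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mailbox layout (the actual PRU0 SRAM layout is not given).
struct Mailbox {
    std::uint32_t pending = 0;               // nonzero while a request is waiting
    std::uint32_t count = 0;                 // number of 16 bit words to transfer
    std::array<std::uint16_t, 256> data{};   // words out, overwritten by words in
};

// Stand-in for the SPI pins: shift one word out and return the word shifted
// in. Here the "device" just echoes the inverted word.
inline std::uint16_t ShiftWord(std::uint16_t out) {
    return static_cast<std::uint16_t>(~out);
}

// Service at most one pending mailbox, starting with the one after the last
// mailbox looked at so that neither requester can starve the other.
// Returns the index serviced, or -1 if both mailboxes were idle.
inline int ServiceOnce(std::array<Mailbox, 2>& boxes, int& next) {
    for (int tries = 0; tries < 2; ++tries) {
        const int idx = next;
        next = (next + 1) % 2;
        Mailbox& mb = boxes[static_cast<std::size_t>(idx)];
        if (mb.pending == 0) continue;
        assert(mb.count <= mb.data.size());
        for (std::uint32_t i = 0; i < mb.count; ++i) {
            mb.data[i] = ShiftWord(mb.data[i]);  // received word replaces sent word
        }
        mb.pending = 0;                          // signal completion to the requester
        return idx;
    }
    return -1;
}
```

In this sketch the requester (CPU or PRU1) fills `data` and `count`, sets `pending`, and then polls until `pending` returns to zero before reading the results back out of `data`.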

BDC
The CPU interface SpiXferArray16 is used by the BDC software to read and write 16 bit registers.  A write can be issued as a SpiXferArray16 of length one, while a read requires two words.  Register reads are conducted this way so that a read (command in the first word, result obtained in the second) is not split across transfers done by PRU1 (or vice versa).  The BDC software SpiRead16/Write16 is used by the QDC100 software to configure the quadrature conversion as well as the gpio ports implemented by the BDC software.
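The read and write pattern can be sketched roughly as follows. This is not the actual BDC source: the `SpiXfer16` signature, `BdcWrite`, `BdcRead`, and the `FakeBdc` test model are hypothetical. The word layout (R in bit 15, W in bit 14, a 6 bit register selector, and an 8 bit value) comes from the SPI word format in the Prj145 Part 5 post. The sketch also assumes that an all zero second word is treated as a no-op by the FPGA.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>

// Assumed shape of an array transfer: every word in 'data' is shifted out and
// replaced in place by the word that was shifted in at the same time.
using SpiXfer16 = std::function<void(std::uint16_t* data, std::size_t count)>;

// A write fits in a single word, so a transfer of length one is enough.
inline void BdcWrite(const SpiXfer16& xfer, unsigned reg, std::uint8_t value) {
    assert(reg < 64);
    std::uint16_t word = static_cast<std::uint16_t>(0x4000u | (reg << 8) | value);
    xfer(&word, 1);
}

// A read needs two words in ONE transfer: the command goes out in word 0 and
// the register contents come back while word 1 is shifted. Keeping both words
// in one request means another requester cannot slip in between them.
inline std::uint16_t BdcRead(const SpiXfer16& xfer, unsigned reg) {
    assert(reg < 64);
    std::uint16_t words[2] = {
        static_cast<std::uint16_t>(0x8000u | (reg << 8)),  // read command
        0x0000                                             // assumed no-op filler
    };
    xfer(words, 2);
    return words[1];
}

// Toy model of the FPGA side, for host testing only.
class FakeBdc {
public:
    void Transfer(std::uint16_t* data, std::size_t count) {
        for (std::size_t i = 0; i < count; ++i) {
            const std::uint16_t in = data[i];
            data[i] = pending_;                       // previous read result shifts out
            const unsigned reg = (in >> 8) & 0x3Fu;
            if (in & 0x4000u) regs_[reg] = static_cast<std::uint16_t>(in & 0xFFu);
            if (in & 0x8000u) pending_ = regs_[reg];  // latched for the next word
        }
    }
private:
    std::uint16_t regs_[64] = {};
    std::uint16_t pending_ = 0;
};
```

On real hardware the `SpiXfer16` callable would wrap the PRU mailbox request instead of `FakeBdc::Transfer`.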

Tboard, Mboard, DDC/QDC
The BDC gpio interface is used by the user space I2C (UI2C) software to support the devices on the T board (MAX2112 tuner, and MCP4725 DAC for variable gain amplifier), while the BDC gpio interface is also used by the SPI interface within the ADF4351 control software used by the M board (synthesizer section).

Concurrent with all of the above operations PRU1 under the control of the QDC100 software continuously requests PRU0 to conduct SPI read operations to the FPGA.  These operations are: a) read fifo samples in sets of 256, and b) read the sample fifo threshold register to ensure it is not empty (and can safely be read).   PRU1 places the samples it streams from the fpga fifo to a DRAM circular buffer shared with the cpu.  The QDC100 Get2kSamples() interface extracts those from the DRAM buffer and provides them to the calling application.
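The hand-off between PRU1 and the CPU can be pictured as a single producer, single consumer circular buffer. The sketch below is only a host-side illustration in ordinary memory: the class name `SampleRing` and its methods are hypothetical, and the real shared DRAM layout and the synchronization between PRU1 and the CPU are not shown in this post.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Single producer / single consumer ring of 16 bit samples.
// head_ and tail_ count total samples written and read; the position in the
// buffer is the count modulo the capacity.
class SampleRing {
public:
    explicit SampleRing(std::size_t capacity) : buf_(capacity) {}

    // Producer side (PRU1 in the real system): append one block, for example
    // the 256 samples read from the FPGA fifo. Returns false if it would
    // overwrite samples that have not been consumed yet.
    bool Push(const std::int16_t* src, std::size_t n) {
        if (n > buf_.size() - Available()) return false;
        for (std::size_t i = 0; i < n; ++i) {
            buf_[(head_ + i) % buf_.size()] = src[i];
        }
        head_ += n;
        return true;
    }

    // Consumer side (the CPU): copy out exactly n samples, or nothing at all
    // if fewer than n are available.
    bool Pop(std::int16_t* dst, std::size_t n) {
        if (Available() < n) return false;
        for (std::size_t i = 0; i < n; ++i) {
            dst[i] = buf_[(tail_ + i) % buf_.size()];
        }
        tail_ += n;
        return true;
    }

    std::size_t Available() const { return head_ - tail_; }

private:
    std::vector<std::int16_t> buf_;
    std::size_t head_ = 0;
    std::size_t tail_ = 0;
};
```

A Get2kSamples() style call then amounts to `Pop(dst, 2048)`, which only succeeds once eight 256 sample blocks have accumulated. Sharing such a buffer between two processors additionally requires care with memory ordering and caching, which this sketch leaves out.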

Other interfaces are present at all horizontal lines and not shown for clarity (e.g. the QDC100 provides an interface to select the channel to be streamed through the fifo and the digital downconversion frequency, the M board provides an interface to set the LO frequency, and the T board provides a tuning frequency for its LO, rf and baseband gain settings and baseband filter cutoff frequency).

The above structure allows each of the pieces of software to support various environments and configurations while at the same time maintaining device independence and allowing concurrent operation and control of the devices.


Wednesday, July 20, 2016

Prj 146 - VHDL for Quadrature Downconverter (Part 2)

This post walks through a VHDL application for quadrature down conversion on a dual channel 40MSPS ADC. A previous project constructed a digital down converter in an LX9 with a single real input channel, CIC filters and decimators, and a compensation filter.  For this project I wanted to do something similar, but without CIC filters and with multiple stages of down conversion to allow variable bandwidth sampling.  The following is a block diagram of the VHDL image, referred to as QDC-100.  All of the VHDL discussed is here.
QDC-100 Summary Block Diagram
The digital control portion of the FPGA image is the BDC block described here.  It provides the SPI interface to the BeagleBone Black along with control and status discretes to the signal processing block. Internally it executes at 96MHz (using the DCM). All of the sample data from the signal block is transferred via a fifo, and all of the control information to the signal block is carried on discretes.
QDC-100 Signal Block Summary
The channel match and test block provides a mux to select the actual ADC samples or a test signal, and equalizes the channels to compensate for DC bias and gain differences in the analog front end.  The mixer and LO stage is just as it sounds: a numerically controlled oscillator (DDS) whose output is complex multiplied by the input signal.  The decimators are two stages of filter and decimate to bring the sample rate down to a manageable level.  Two stages are used to provide different bandwidths and reduce the filter requirements.  The final stage selects samples from any of the previous stages and writes them to a fifo in I/Q 16 bit signed format for extraction by the digital control block.  The fifo provides a minimum number of samples (2048) indicator and a flush/reset control.

The following is a block diagram of the channel match and test pattern generator section.
QDC-100 Channel Match and Test Pattern Generator
The test pattern generator creates an I/Q signal with a known level using a small DDS (8 bit phase, 12 bit output) and a shift right level control (0dB, -6dB, …).  The ability to use a known, high quality digital signal is extremely helpful in testing the rest of the design.  If the TPG value is 0 the muxes select the ADC data; otherwise they select the test pattern, which is determined by the TPG value.  The output of the muxes for each of the I and Q channels is then run through an adder and divider to remove the DC offset bias and equalize the amplitude of the channels.  Removing the DC bias removes downstream carrier products of the subsequent mixer, and equalizing the amplitudes (removing channel gain variation) reduces downstream images.  The gain equalization is accomplished by multiplying the signal by an 8 bit signed value and then selecting the high bits of the output.  This has the net effect of multiplying the signal by N/128, where N is the Inum or Qnum value.  This allows small variations to be matched, essentially by reducing the higher gain signal by a small fraction.  Using a signed quantity has the added benefit of being able to swap the I/Q channels.  If we use a -N value for one channel it shifts that channel by 180 degrees.  If that channel lagged the other by 90 degrees, it now leads it by 90 degrees, thus swapping I/Q.  The outputs are 12 bit I and Q values which have been equalized.  The entire block works on a sample clock basis.
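To make the N/128 scaling concrete, here is a small C++ model of one channel of the equalizer. It is only a sketch of the arithmetic described above, not a translation of the VHDL: the function names are made up, the offset is assumed to be added before the gain multiply, and the final clamp to the 12 bit range is an assumption about how overflow is handled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Divide by 128 rounding toward negative infinity, which is what dropping the
// low 7 bits of a two's complement product does in hardware.
inline std::int32_t FloorDiv128(std::int32_t v) {
    return (v >= 0) ? (v >> 7) : -((-v + 127) >> 7);
}

// One channel of the equalizer: remove the DC offset, then scale by num/128.
// 'sample' is a signed 12 bit value; 'num' plays the role of Inum or Qnum.
inline std::int16_t Equalize(std::int16_t sample, std::int16_t dcOffset,
                             std::int8_t num) {
    const std::int32_t centered = std::int32_t{sample} + dcOffset;
    const std::int32_t scaled = FloorDiv128(centered * num);
    // Assumed: the result is limited to the signed 12 bit range.
    return static_cast<std::int16_t>(
        std::clamp<std::int32_t>(scaled, -2048, 2047));
}
```

For example, `Equalize(1000, 0, 127)` gives 992 (a gain of 127/128), while `Equalize(1000, 0, -127)` gives -993, the same magnitude to within one count but inverted, which is the 180 degree shift used to swap I and Q.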

The following is a block diagram of the mixer and LO section.
QDC-100 Mixer and Local Oscillator
This block is relatively straightforward and uses Xilinx generated DDS and complex multiplier cores. The DDS block generates an in-phase (12 bits) and quadrature (12 bits) sinusoid which is fed to the multiplier.  The multiplier's output is two 16 bit values (I and Q), routed to subsequent stages as a 32 bit vector.  Since the PINC bits are a set of discrete lines from another clock domain, a small state machine (the AXI Slave writer) monitors for value changes and writes the phase increment register of the DDS core. Again, the entire block works on a sample clock basis.
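For a phase accumulator DDS, the phase increment and the output frequency are related by f_out = PINC x f_clk / 2^B, where B is the accumulator width. The post does not state the phase width used in the QDC-100 DDS core, so the sketch below takes it as a parameter. The function names are hypothetical and the 16 and 24 bit widths in the examples are purely illustrative.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Phase increment for a desired LO frequency, rounded to the nearest step.
// phaseBits is the accumulator width (an assumption, not taken from the post).
inline std::uint32_t PhaseIncrement(double fOutHz, double fClkHz,
                                    unsigned phaseBits) {
    assert(phaseBits >= 1 && phaseBits <= 32);
    assert(fOutHz >= 0.0 && fOutHz < fClkHz);
    const double scale = std::ldexp(1.0, static_cast<int>(phaseBits));  // 2^B
    const std::uint64_t raw =
        static_cast<std::uint64_t>(std::llround(fOutHz / fClkHz * scale));
    const std::uint64_t mask = (std::uint64_t{1} << phaseBits) - 1;
    return static_cast<std::uint32_t>(raw & mask);
}

// Frequency actually produced by a given increment, which shows the tuning
// error introduced by rounding.
inline double ActualFrequency(std::uint32_t pinc, double fClkHz,
                              unsigned phaseBits) {
    return pinc * fClkHz / std::ldexp(1.0, static_cast<int>(phaseBits));
}
```

With the 40MHz sample clock, a 10MHz LO and an illustrative 16 bit accumulator give an increment of 16384 (one quarter of full scale), and the tuning resolution is f_clk / 2^B.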

The following is a block diagram of the decimator section.
QDC-100 Two Stage Decimation
There are two decimator blocks.  Each operates on 16 bit inputs and outputs and has its own internal clock.  The decimators are Xilinx generated cores and include a FIR filter applied prior to the decimation.  The first filter decimates by 10 and the second by 20.  The following is a block diagram of a decimator block.  Both A and B are structurally the same with the differences being the decimation rate, filter applied, and internal clock rate used.

The following is an internal block diagram of a single decimation stage:
QDC-100 Decimation Stage Internals
Each decimator uses an independent filter clock derived from the sample clock.  A DCM is used to generate a filter clock multiple times the sample clock (e.g. from 40MHz to 200MHz).  Each decimator has input and output fifos with independent read and write clocks.  This isolates the higher frequency filter clock domain.  The higher rate filter clock allows a smaller number of DSPs to be used to generate a FIR filter and decimator with a larger number of taps than would be possible using a filter clock at the same rate as the sample clock.
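The benefit of the faster filter clock can be put in numbers. Decimating 40MSPS by 10 gives 4MSPS, and decimating that by 20 gives 200kSPS. If a multiplier performs roughly one multiply-accumulate per filter clock (an assumption; the exact behavior depends on how the core is configured), then the number of filter clocks per output sample bounds how many taps a single DSP slice can serve. The helper below is a hypothetical back-of-the-envelope calculation, not part of the design.

```cpp
#include <cassert>
#include <cstdint>

struct StageBudget {
    double outputRateHz;     // sample rate after decimation
    double clocksPerOutput;  // filter clock cycles available per output sample
};

// Rate and clock budget for one filter-and-decimate stage.
inline StageBudget DecimatorBudget(double inputRateHz, unsigned decimation,
                                   double filterClockHz) {
    assert(decimation > 0 && inputRateHz > 0.0 && filterClockHz > 0.0);
    const double out = inputRateHz / decimation;
    return { out, filterClockHz / out };
}
```

For the first stage, `DecimatorBudget(40e6, 10, 200e6)` yields a 4MSPS output with 50 filter clocks per output sample, compared with only 10 if the filter ran at the 40MHz sample clock. The second stage brings the rate to 200kSPS; its filter clock rate is not stated in the post.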

The signal block output is simply a fifo.  The input to the fifo is a 32 bit wide interface (16 bits I, 16 bits Q), while the output is a 16 bit wide interface.  This takes advantage of the Xilinx generated fifo capability to have different input/output widths.  This allows single cycle writes of I and Q data during a single sample clock while allowing the digital processor interface to retain a 16 bit wide register/fifo read interface.  The following is a block diagram of the output.
QDC-100 Signal Block Sample Output to Digital Block
The mux select determines what gets written to the fifo.  Output from any of the stages or processing can be selected.  The input fifo clock is the sample clock with the write valid always being true for non-decimated inputs or the valid signal from the decimators.  The output fifo clock is the digital control clock with the read enable driven from the digital control register read block.
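The width conversion can be modeled as one 32 bit write followed by two 16 bit reads. The post does not say which half holds I or which half is read first, so the sketch below assumes I sits in the upper 16 bits and is read out first. The names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

struct Iq {
    std::int16_t i;
    std::int16_t q;
};

// Write side: one 32 bit word per sample clock (assumed: I in the upper half).
inline std::uint32_t PackIq(std::int16_t i, std::int16_t q) {
    return (static_cast<std::uint32_t>(static_cast<std::uint16_t>(i)) << 16) |
           static_cast<std::uint16_t>(q);
}

// Read side: the same word leaves as two 16 bit reads (assumed: upper first).
inline std::uint16_t FirstRead(std::uint32_t word) {
    return static_cast<std::uint16_t>(word >> 16);
}
inline std::uint16_t SecondRead(std::uint32_t word) {
    return static_cast<std::uint16_t>(word & 0xFFFFu);
}

// Reassemble a signed I/Q pair from two consecutive 16 bit reads.
inline Iq FromReads(std::uint16_t first, std::uint16_t second) {
    return { static_cast<std::int16_t>(first), static_cast<std::int16_t>(second) };
}
```

Whatever the actual ordering is, the software must consume reads in pairs; a single dropped or extra 16 bit read would swap I and Q for every sample that follows.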

Wednesday, June 22, 2016

Prj 146 - Dual Channel 40MSPS ADC (Part1)

With the F board (BeagleBone Black Spartan6 LX9 FPGA), I now have the option of using both faster ADCs as well as dual channel variants to support quadrature down converters.  This board uses an LTC2292, which is a dual channel 40MSPS 12 bit ADC.  It is similar to previous LTC ADCs I have used (FPGA variant and non-FPGA).  The schematic of the board is shown below.
Dual ADC Schematic - LTC2292, IF inputs, and sample clock.
Dual ADC Schematic - Voltage Regulators
There are a couple of variants and decisions on the board worth noting.  First, I configured it to use 0.5V full scale rather than 1V and configured the output to be signed 12 bit values.  Second, the analog inputs are similar to previous versions, using a center tap transformer with 50 ohm input termination. This comes directly from the manufacturer's reference design and has worked well in the past.  Finally, a 40MHz CMOS oscillator is used, but with a buffer to the FPGA on the carrier board. Separate regulators are used for the analog and digital supply voltages.  Everything is 3.3V and liberally supplied with bypass capacitors and chokes. The KiCad source material for the board is available on github here. The board is a two layer OSHPark with the layout shown below.
Dual ADC Two Layer PCB Layout
A picture of the first unit assembled and mounted on an F board (mounted on a BBB) is shown below.
Dual 40MSPS ADC board.  Connectors at right are from the underlying F board it is mounted on.  Connectors on the left are from the underlying Beagle Bone Black which the F board is mounted on.
For testing purposes the anti-aliasing low pass filter elements were not populated.  This board is designed to work with a digital down converter VHDL image on a Spartan 6 LX9 (F board).  The construction of this board is straightforward.  The only difference is that with this unit I tried applying solder paste using a 22 gauge plastic needle on the paste syringe.  In previous work I only had the default large metal needle that comes with the paste syringe itself.  Using the fine plastic needles makes a huge difference: you can dispense fine amounts of paste and let the needle touch the pads without fear of the metal scraping.

Related:
Prj 145 - Beagle Bone Black Simple LX9 FPGA board (Part 1)
Prj 145 - BBB LX9 FPGA Board Design (Part 2)
Prj 145 - BBB LX9 FPGA Board Construction (Part 3)
Prj 145 - BBB LX9 JTAG Boundary Scan Utilities (Part 4)
Prj 145 - BBB LX9 C++ and VHDL (Part5)

Thursday, June 2, 2016

Prj 144 - DVB Tuner Board

I wanted to use one of the existing DVB tuners for various RF applications.  The appeal of these ICs is that they include a high level of integration with quadrature mixers, amplifiers and filters, are cheap and easily accessible, come in small packages, and are relatively easy to use.

I settled on the MAX2112.  Based on experience with other highly integrated devices, I decided to start simple and build a small board based on the circuit in the manufacturer's evaluation board (I would have just used their eval board, however, these are always incredibly expensive).  The following circuit captures that board.
DVB Tuner Board Schematic
Two separate low noise regulators are provided, although one is sufficient given the low power of the device.  The MAX2112 has a 75 ohm input impedance, so provisions are made for a resistive broadband input match (with its associated loss of input power) or an LC tuned input match to bring the board input impedance to 50 ohms.  The loop filter for the synthesizer was copied from the evaluation circuit without modification.  The differential outputs were converted to single ended using op-amps.  The only change from the evaluation board is the inclusion of a DAC to provide a programmable voltage to the AGC input of the device.  The DAC was selected to have an I2C address different from the tuner.

Since this was my first part using an I2C interface it took some time to develop and debug the software (Tboard and user space I2C).  The board was populated with only the I2C DAC and an LED on its output.  Normally adding an LED is not good practice as it can add noise (something not desirable on the input voltage to a high gain AGC amplifier), however, for testing purposes it proved very helpful. Initial development and testing was conducted using an I board and then updated to support an Fboard with BDC VHDL.  The first unit used a 20MHz crystal, while the loop filter values were specified for a 27MHz crystal from the evaluation board.  This worked out ok since via software control I was able to divide the reference oscillator by 2 and achieve lock.  Below is a picture of that unit.
MAX2112 Based Tuner Board (second regulator not populated)
The programming information is a little sparse.  If you have used a synthesizer before it makes sense, but I would not choose this as the first part to work with a PLL (loop filter, lock debug).  The one subtlety was the initial value of the VCO filter registers.  When I changed these from the power-on defaults, I had problems with the device locking.  It seems to conduct the VCO search in only one direction in frequency (this wasn’t entirely clear from the data sheet or reference board material).  Having overcome this, I was able to test both IF channels using RF inputs across the fully specified range.

There are all kinds of characterizations I wanted to perform, but without much test equipment, and particularly equipment set up for quadrature baseband evaluation, I decided to keep it simple and move on to a dual channel ADC I could use with this board.  A quick check of the input amplifier gain showed reasonable and expected performance.  The other quick test easily accomplished was checking the programmable filter response.  The simplest approach, albeit not the most accurate, was to set a tuning value and filter cutoff frequency and scan an RF tone about the tuning frequency.  I could then use a spectrum analyzer with max hold history on and get the outer envelope of the fundamental as it was swept through the frequency range.  The following diagram captures those results.

Programmable Filter Response (See text for measurement approach and caveats)
The downside of this approach is that, far into the stop band, the second harmonic of the spectrum analyzer is higher than the filter roll off.  As a result, frequencies far from the pass band show a higher max hold value than the actual response because the second harmonic pushes up the history value.  So once you get -30dB or more down in the response you cannot see the true roll off of the filter, only something shallower that is pushed up by harmonics in the measurement device when it is seeing the fundamental at lower frequencies.  The tuner is set to 975MHz with the input swept from this to +20MHz.  An attenuator is used at the analyzer input to keep the signal level low and minimize its harmonic responses.  The analyzer is a 50 ohm input on a single output IF channel with the other terminated in 50 ohms.

So in short, the tuner is working as expected and within my current measurement capabilities.  Further characterization will have to wait until I have a dual channel ADC. 

Tuesday, May 17, 2016

Prj145 - BBB LX9 C++ and VHDL (Part5)

With an FPGA board and JTAG tools to load an image, the next step is to actually develop and use a VHDL application.  Since much of the VHDL is custom to the daughter board, I focused on getting a basic digital control interface (BDC) that I could use with different daughter cards.  This has the added benefit of testing out the BBB-Fboard SPI and providing another set of GPIO ports.  The effort includes both the C++ and VHDL sides.  The following is a conceptual organization of the software.
C++ and VHDL Organization
The Fboard software primarily provides an interface to read and write the SPI interface to the Fboard hardware.  There are different flavors of SPI access, including 8 and 16 bit accesses.  A device tree is installed and determines how the SPI interface is accessed: via the BBB sysfs interface or a PRU0 image.  The PRU0 version is significantly faster and drives the SPI at multiple MHz, while the sysfs version is on the order of kilohertz but is easier to start debugging and testing with.  The intent is that PRU1 can be used to issue SPI commands to PRU0 for application specific streaming of high speed data.

The basic digital control block (BDC) can be included in any FPGA image and provides a register interface via the SPI link between the BBB and the Fboard.  The control block is based on previous efforts and looks a lot like the Prj 141 digital interface.  The difference is that I wanted to simplify the programming model and allow each access to specify the register rather than needing a register access to control a selector for the next access to get to the target register.  A basic block diagram is below.
VHDL ControlBlock Implementing Basic Digital Control Interface
Each control block contains a set of 8 bit read/write registers, some 16 bit counters, and two GPIO units.  Registers/counters can be added with each bit or set of bits interfacing with other application specific VHDL logic. Each GPIO unit has three registers (one for input, output, and direction of pins). The GPIO units connect directly with each of the 10 pin ports on the Fboard. Each of these registers has its data and read/write valid signals muxed based on a register id from a finite state machine.
This state machine conducts the read/write of a register based on an SPI command.  The finite state machine has a register fifo with a 16 bit SPI slave.  The state machine to process SPI commands operates at 96MHz (8x the XO on the F board using a DCM) and takes multiple clocks to read or write a register.

The SPI interface timing is specified to support a 20MHz SPI clock (for FPGA timing and layout only, the BBB even with the PRU will not exceed 16MHz with 9MHz being more realistic).  The software is designed to clock 16 bits into the SPI register at the same time 16 bits are clocked out.  The format of a SPI word is below.
         -- Bit position
         -- 1111 1100 0000 0000
         -- 5432 1098 7654 3210
         -- RWNN NNNN VVVV VVVV
         -- Where:
         -- R = 1 => read
         -- W = 1 => write
         -- N = register selector
         -- V = 8 bit write value
The upper two bits select whether a read or write is conducted.  In both cases the next 6 bits identify the register to be operated on.  In the case of a write, the following 8 bits of value are written to the specified register.  In the case of a read, these 8 bits are ignored, and the specified register is read (16 bits) and saved in an internal register.  On the next SPI access, while the new command bits are being shifted in, these 16 bits are being shifted out.
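The word format above maps directly onto a few shift and mask helpers. The following C++ sketch encodes and decodes command words according to that layout; the function and constant names are hypothetical and are not taken from the Fboard or BDC source.

```cpp
#include <cassert>
#include <cstdint>

// Bit layout from the format comment above:
//   bit 15     R (read)
//   bit 14     W (write)
//   bits 13..8 register selector (6 bits)
//   bits 7..0  write value (8 bits)
constexpr std::uint16_t kReadBit = 0x8000;
constexpr std::uint16_t kWriteBit = 0x4000;

// Command word that writes 'value' into register 'reg'.
inline std::uint16_t MakeWriteWord(unsigned reg, std::uint8_t value) {
    assert(reg < 64);  // only 6 selector bits are available
    return static_cast<std::uint16_t>(kWriteBit | (reg << 8) | value);
}

// Command word that requests a read of register 'reg'. The low 8 bits are
// ignored by the FPGA on a read, so they are left at zero.
inline std::uint16_t MakeReadWord(unsigned reg) {
    assert(reg < 64);
    return static_cast<std::uint16_t>(kReadBit | (reg << 8));
}

// Field extraction, handy for logging or for a software model of the FPGA.
inline unsigned RegisterOf(std::uint16_t word) { return (word >> 8) & 0x3Fu; }
inline std::uint8_t ValueOf(std::uint16_t word) {
    return static_cast<std::uint8_t>(word & 0xFFu);
}
```

For example, writing 0xA5 to register 3 produces the word 0x43A5, and a read of register 63 produces 0xBF00.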

This might seem a bit counterintuitive, however, it keeps the finite state machine simple and extensible, focuses on single operation writes (8 bits, which is sufficient for my purposes), and allows streaming 16 bit reads with little overhead (e.g. for ADCs).  It also allows 2^6 = 64 registers to be defined. The first 15 registers are dedicated to common operations like: a) FPGA image and version identification, b) debug counters, c) LED control/signaling, and d) the two GPIO ports.  The second set of 15 registers is dedicated to the application specific basics (for example the DDS frequency value).  A 16 bit read fifo is used as the last register available.

Saturday, April 30, 2016

Prj145 - BBB LX9 JTAG Boundary Scan Utilities (Part 4)

Previous posts walked through the overview, schematic, and fabrication of a fpga board for mixed signal use with a Beaglebone Black.  This post summarizes the boundary scan tools used to load the fpga image.

The JTAG boundary scan (JTAG for short hereafter) along with the DONE, INIT, and PROGRAM_B pins are accessible via BBB GPIO pins.  There are two utilities used to provide key functionality with these pins.
JTAG Boundary Scan Tools for BeagleBone Black LX9 Board
The first is Fxvc which is a virtual cable daemon based on software from Xilinx and tmbinc. This utility allows the Xilinx tool set to program and interrogate the fpga without a hardware cable. Within iMPACT, you select a loadable module under cable setup and supply:

xilinx_xvc host=192.168.0.2:2542 disableversioncheck=true

The tool then uses the network connection to conduct all JTAG operations.  The code is factored into two components: the general server which handles network transactions and the board specific portion which turns JTAG operations into pin level settings.  The tool was developed using a JTAG device simulator and was actually a really insightful exercise in understanding JTAG boundary scan.

The second tool is Fxsvf which is an embedded SVF player.  SVF is a way to express JTAG operations in a text file, while XSVF is a Xilinx binary form of SVF which results in more compact files.  Again, the application is broken into two components, a portion which handles the reading and parsing of an XSVF file and a portion which is board specific and sets the JTAG pins appropriately. The general part comes from the Xilinx XAPP058.  Due to licensing, this portion is currently not open source and can only be obtained by registering with Xilinx.  For this reason, the general XSVF player portion is treated as an installed library that you link the board specific pin manipulation code against to produce the final application.

One of the downsides of the current approach is the performance.  Manipulating GPIO pins from user space with the sysfs interface is quite slow (but simple).  I knew this going in but underestimated the convenience of being able to attach to the fpga JTAG interface just by running an application and having an Ethernet connection (which is always the case for my BBB work).  Not having to drag out yet another cable is really nice.

One of the issues I encountered was getting the ISE 14.7 tools to properly program the flash.  This process is what Xilinx calls indirect programming.  It involves loading a fpga image via JTAG that can manipulate the flash SPI pins via the JTAG interface.  This would not work for me.  At first I suspected a problem with my layout of the flash SPI, then I suspected a fabrication error, then I investigated Fxvc errors.  Eventually I ended up developing my own utility, Fflash, to access the SPI flash and found no problems.  I found a couple of data points indicating the ISE tools sometimes have issues with SPI flash access (e.g. my identical issue, the ID check failing; however, the workaround failed to solve my problem).  Given this along with the support state of ISE, I decided to abandon this approach and just work with my own flashing utility.  This is less of an issue than I first thought since my general use model is to load an FPGA image with iMPACT while debugging and then, once the image is finalized, save a copy and flash it.

The process involves first loading via Fxsvf a fpga image which directly connects the host SPI pins to the flash SPI pins. The Fflash utility then programs the SPI flash using the host SPI lines.  When using the PRU interface to the host SPI pins this is extremely fast - about 3 seconds to erase the device (device limited) and less than a second to program and verify the image into the flash.

Thursday, April 14, 2016

Prj 145 - BBB LX9 FPGA Board Construction (Part 3)

This post captures a few notes on the fabrication of a BeagleBone Black minimal FPGA board. Previous posts covered the block diagram and schematic.  The board is a 2 layer OSH Park order at roughly 3" x 2".  One of the differences with this board is that I used 0603 resistors and capacitors for density reasons rather than the 0805's I normally use.  I have used these in the past in a limited capacity.  The mechanics of mounting these are no different, however, I did find that the smaller parts slowed me down.  In the end, I think it was worth it as there were a couple of places where the 0805's would have made the board layout more difficult. Beyond this and the TQG package, there is nothing too challenging about this build.
Spartan 6 LX9 Board.

LX9 Board Mounted to BeagleBone Black
This was the first time I had used a TQG package, so I was a little nervous about how it would turn out.  I have gotten reasonably good at working with 0.5mm pitch QFNs but only in the 40 pin range. Airgunning the QFNs works really well and they self align nicely if you get the solder paste application right.  I only have a jeweler's loupe, not a microscope, so manual alignment was a concern. My attempt on the first version of this board used an airgun.  This was not a good idea.  The problem is the sheer area: it's 22mm x 22mm.  It took forever to get the paste to melt and I had a hard time evenly distributing the hot air around the perimeter of the part.  There are hoods for air guns (which I do not have).  The board above used manual placement with a soldering iron.  I tacked down a pin on one corner, inspected, and then tacked down a pin on the opposing corner.  This was followed up with running a solder bead down each side and then wicking off the excess solder (you can see the flux residue from this around the part).  This worked out extremely well and was simple to do.  The picture below captures a closeup of the end result.
Closeup of hand soldering and alignment of TQG-144.
The only issue with the technique is that if too much solder is applied it tends to walk up the knee of the pins, where it creates shorts with adjacent pins.  Solder this high in the knee is difficult to wick off. I found that inspecting all of the pins from three different angles (front on, top angled left, and top angled right) allowed me to catch all instances of this.

Related:
Prj 145 - Beagle Bone Black Simple LX9 FPGA board (Part 1)
Prj 145 - BBB LX9 FPGA Board Design (Part 2)
Prj 145 - BBB LX9 FPGA Board Construction (Part 3)
Prj 145 - BBB LX9 JTAG Boundary Scan Utilities (Part 4)
Prj 145 - BBB LX9 C++ and VHDL (Part5)