Saturday, February 20, 2016

AdcHttpd - HTML5/Browser Based Signal Analysis Tool

As part of the X board/DDC efforts I decided to redo the adc tool I was using.  The original set used a java application on the PC and a server on the BBB.  The java application provided the controls, the presentation of the data, and the processing of the samples.  This worked fine, but each time I went to take a measurement I needed the PC application.  There also seemed to be a lot of interface work in both the java client and the server every time I wanted to make a small change (particularly for controlling the acquisition and hardware configuration).  Previous experiments with javascript/browsers indicated that network and processor performance and loading would be sufficient to support the intended use.  There are two advantages to this approach in my mind: a) there is no client application, just a browser, and b) the interfacing software is simpler and better localized by using JSON, which provides good support for serialization/deserialization in javascript and is easy to produce and parse within a C++ application.

The following image shows the browser presentation.
Chrome with AdcHttpd on a BeagleBone Black with 16 bit DDC data from a LX9

The main area is an HTML5 canvas which is drawn using a basic graphing toolkit I developed for instrumentation purposes.  The buttons at the bottom are HTML buttons with javascript actions.  The first row allows presentation changes (envelope, peak picking, user markers, storing a memory, changing Y limits).  The second row controls server side parameters such as the function to perform (in this case PSE, or power spectral estimation), the number of points to produce, the number of averages to use, the channel to select, and the LO frequency used in down conversion.  The channel is the tap point referenced in the DDC.  The final row sets the state to run or not run, or resets various pieces of software.  The line above the graph also includes key parameters in effect along with the date and status of the connection to the http server (it turns red and indicates not connected if a threshold of failed server requests is reached).

The numeric quantities that are free form input (e.g. F1(Hz) or LO mixer or the upper and lower graph limits) are entered via a javascript keypad.  The following screen capture shows the same channel or signal as the previous capture but this time in the time domain and with the keypad popped up to modify the Y limits.
Javascript Keypad Input (Time series in background of signal in previous figure)
The browser tab consumes about 10% of a CPU core on an older PC (AMD 1.8GHz quad core) and puts about 1% load on the Ethernet interface.  The BeagleBone CPU is at 98% running the http server.  The BBB limits the update rate based on its ability to process the samples (fftw for ARM is used for best performance).

There are several components to the internals of the browser code, http server code, hardware model, and device code on the BeagleBone Black.  The basic model is that the server is the definitive controller of state and operations while the browser code simply requests changes and conducts presentation.  At the highest level the following sequence diagram summarizes the interactions of the application.

Browser-AdcHttpd Summary Sequence Diagram 
The Browser lifeline captures the behavior of the javascript code executing within the browser, while the AdcHttpd lifeline captures the software executing on the BeagleBone Black.  There are only two events handled by the javascript code: a timer and user input.  The timer forces a sequence of events which boils down to getting state and data from the server and drawing a plot of the current data.  All information is obtained from the server via http GET requests.  The results coming from the server are JSON text representing application variables.  The diagram is slightly off in that the responses coming back are processed asynchronously from the requests.  There is additional logic to keep the state requests at a lower rate than the requests for data.  In addition, there is logic to adjust the polling rate based upon the operational state of the instrument.  Quantities in the overall state include things like the channel number to process, the run/stop state, and the mixer frequency.  The data coming from the server is just an x,y list of points to be plotted on the graph.
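To make the polling behavior concrete, here is a minimal sketch of the timer side written in Python rather than the browser javascript.  The paths "/state" and "/data", the JSON field names, and the rule of skipping data requests while stopped are all assumptions for illustration, not the actual AdcHttpd interface.

```python
import json


class Poller:
    """Timer-driven polling: data every tick, state every `state_every` ticks."""

    def __init__(self, fetch, state_every=5):
        self.fetch = fetch              # callable(path) -> JSON text, stands in for an http GET
        self.state_every = state_every  # hypothetical ratio of data requests to state requests
        self.tick_count = 0
        self.state = {}
        self.points = []

    def on_timer(self):
        # State is requested at a lower rate than data.
        if self.tick_count % self.state_every == 0:
            self.state = json.loads(self.fetch("/state"))
        self.tick_count += 1
        # Assumed simplification: no data requests while the instrument is stopped.
        if self.state.get("run", False):
            self.points = json.loads(self.fetch("/data"))["xy"]
        return self.points


def fake_server(path):
    # Stand-in for AdcHttpd; the field names are illustrative only.
    if path == "/state":
        return json.dumps({"run": True, "channel": 2, "lo_hz": 1.0e6})
    return json.dumps({"xy": [[0.0, -90.5], [1.0, -88.2]]})
```

Calling Poller(fake_server).on_timer() from a periodic timer would return the latest point list to hand to the plotting code.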

The other events processed by the browser javascript are user inputs.  These take two forms: a) those which change server state and are sent directly to the server, and b) those which change local browser application state and can be processed without interaction with the server.  Examples of the former include run/stop state or channel number, and examples of the latter include graphing x and y limits, peak picking enable/disable, and markers on/off.

The server is more involved and is responsible for not only acquiring the data and processing it but also presenting it in JSON format to the client.  The following sequence diagram summarizes the key server components and interactions.

Summary Sequence Diagram of AdcHttpd Internals
There are only two autonomous lifelines here by design.  The first is the http server thread that interacts directly with the client and processes the http GET requests.  The second is the hardware model thread which drives processing of data and hardware interactions.  The only point at which the two threads interact is the hardware data model.  The interactions are designed to require minimal locking and interaction between the two autonomous threads.  This keeps things conceptually simple and decouples the browser client and its interactions and requests from the core hardware servicing and operations.  The hardware object model contains the current and desired operation parameters along with the most recent XY data to be presented in the browser application.
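A shared model of this kind can be sketched as follows.  This is a Python illustration under my own assumptions (the real server is C++); the class name, method names and parameter names are hypothetical, and a single short-lived lock stands in for whatever minimal locking the real model uses.

```python
import threading


class HardwareModel:
    """Shared state between the http thread and the hardware model thread."""

    def __init__(self):
        self._lock = threading.Lock()
        self._desired = {"channel": 0, "lo_hz": 0.0, "run": False}
        self._current = dict(self._desired)
        self._xy = []

    # Called by the http thread.
    def request(self, name, value):
        with self._lock:
            self._desired[name] = value

    def snapshot(self):
        with self._lock:
            return dict(self._current), list(self._xy)

    # Called by the hardware model thread.
    def take_changes(self):
        """Return parameters whose desired value differs, and mark them current."""
        with self._lock:
            changes = {k: v for k, v in self._desired.items() if self._current[k] != v}
            self._current.update(changes)
            return changes

    def publish(self, xy):
        with self._lock:
            self._xy = xy
```

Each method holds the lock only long enough to copy a small amount of data, so neither thread waits on the other's slow work (network I/O on one side, sample processing on the other).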

The hardware model thread checks various operating parameters based on their current value in the memory shared with the http thread.  If changes have occurred, the hardware model thread calls the appropriate device object methods to change the operating parameters.  Examples include which channel is being processed/exported by the hardware and the frequency of the downconversion local oscillator.  After a check on current parameters the hardware model thread invokes ProcessCoherentInterval() on one of the processing objects.  There are processing objects for each of the main types of processing: power spectral estimation, time series, and histogram.  The product of processing a coherent interval is a set of XY points which is updated in the hardware model (and can then be accessed by the remote client via the main http thread).  The hardware model thread continually loops on this processing until a reset is conducted.
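The loop described above might look roughly like the following Python sketch.  Only ProcessCoherentInterval() is named in the actual design; the device method names, the dictionaries used for shared state, and the bounded pass count are assumptions made so the example is small and runnable (locking is left out for brevity).

```python
class FakeDevice:
    """Stand-in for a device object; the method names are hypothetical."""

    def __init__(self):
        self.calls = []

    def set_channel(self, channel):
        self.calls.append(("channel", channel))

    def set_lo_hz(self, hz):
        self.calls.append(("lo_hz", hz))


class TimeSeries:
    """Stand-in processing object returning XY points for one interval."""

    def process_coherent_interval(self, device):
        return [(0, 0.0), (1, 0.5)]


def hardware_loop(desired, current, device, processors, results, max_passes):
    """One pass per coherent interval; `desired` is written by the http side."""
    appliers = {"channel": device.set_channel, "lo_hz": device.set_lo_hz}
    for _ in range(max_passes):  # the real loop runs until a reset
        for name, apply_change in appliers.items():
            if desired[name] != current.get(name):
                apply_change(desired[name])  # push the change to the hardware
                current[name] = desired[name]
        processor = processors[desired["function"]]
        results["xy"] = processor.process_coherent_interval(device)
```

Note that the device methods are only called when a parameter actually differs, so an unchanged configuration costs nothing per pass beyond the comparison.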

Each of the processing objects deals with a coherent interval a little differently.  The key one is the power spectral estimation processing.  Since the data rates of the selected ADC channel can exceed the processing capability (and indeed the network transport capacity), a processing interval is treated as a flush of current ADC data followed by the acquisition of one or more sets of 2k samples.  The collected data is treated as a coherent set and, in the case of spectral estimation, has a window applied followed by an FFT, magnitude, and normalization to produce the XY results.  In the case of low sample rate channels, 16k or more samples can be treated coherently since successive Get2k calls retrieve contiguously sampled data.  For streams with high bandwidth, the user simply has to avoid configuring power spectral estimates greater than 2k (if you do, you just get phase discontinuities across 2k sample sets and see spectrum broadening).
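The window, transform, magnitude and normalization steps can be sketched in a few lines of Python.  This is an illustration only: it uses real-valued samples, a Hann window, and a plain DFT from the standard library, none of which is claimed to match the actual processing (which uses fftw).  The normalization by the window sum is likewise an assumption, chosen so that a unit-amplitude sine centered on a bin reads 0 dB.

```python
import cmath
import math


def power_spectrum_db(samples, sample_rate_hz):
    """Return (frequency_hz, level_db) points for one coherent interval."""
    n = len(samples)
    # Periodic Hann window; its sum is the coherent gain used for normalization.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * k / n) for k in range(n)]
    gain = sum(window)
    windowed = [s * w for s, w in zip(samples, window)]
    points = []
    for b in range(n // 2):
        # Direct DFT of bin b (O(n^2) overall; an FFT would replace this).
        acc = sum(windowed[k] * cmath.exp(-2j * math.pi * b * k / n) for k in range(n))
        # Single-sided amplitude; the DC bin is also doubled in this simple sketch.
        amplitude = 2 * abs(acc) / gain
        points.append((b * sample_rate_hz / n, 20 * math.log10(max(amplitude, 1e-12))))
    return points
```

For example, 64 samples of sin(2*pi*8*k/64) at a sample rate of 64 Hz produce a peak at 8 Hz with a level very close to 0 dB.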

The device object encapsulates different boards.  There is a common set of C++ ADC and Mixer interfaces that each board is required to export.  The startup method for the device object selects which boards and interfacing software to use.  This can be done by configuration file, code changes, or reasoning on which device tree(s) are currently installed.
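The idea of common ADC and Mixer interfaces with a startup-time board selection can be sketched as below.  The actual interfaces are C++; this Python version, the board name "pattern", and every method name other than the Get2k concept are hypothetical.

```python
from abc import ABC, abstractmethod


class Adc(ABC):
    @abstractmethod
    def get_2k(self):
        """Return the next 2048 contiguous samples."""


class Mixer(ABC):
    @abstractmethod
    def set_lo_hz(self, hz):
        """Set the down conversion local oscillator frequency."""


class PatternBoard(Adc, Mixer):
    """Hypothetical board that returns a repeating ramp instead of ADC data."""

    def __init__(self):
        self.lo_hz = 0.0

    def get_2k(self):
        return [k % 256 for k in range(2048)]

    def set_lo_hz(self, hz):
        self.lo_hz = hz


BOARDS = {"pattern": PatternBoard}  # board name -> class, e.g. read from a config file


def start_device(board_name):
    """Select and construct the board implementation by name."""
    try:
        return BOARDS[board_name]()
    except KeyError:
        raise ValueError(f"unknown board: {board_name}") from None
```

The rest of the server only sees the Adc and Mixer methods, so adding a board means adding one class and one registry entry.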

Tuesday, February 9, 2016

Prj 141 - DDC at 60MSPS (Part 8 - Final)

This is the final post walking through a digital down converter and ADC on a Beaglebone Black (Part1, Part2, Part3, Part4, Part5, Part6 and Part7).

All of the previous work was conducted with a 10MSPS clock and ADC.  The next step was to increase to 60MSPS.  The only changes required were updates to the DDC filters.  Due to the increased decimation some of the internal paths experienced bit growth and needed to be widened.  The internal test pattern generation capability within the fpga image allowed all of these changes to be worked through.

Everything worked fine until an actual signal was applied.  At that point I had problems with digital values being read twice.  This shifts the IQ values by one every now and then, which destroys the quadrature relationship.  The transfer rate across the IDC was 100k samples per second with each sample being 16 bits.  The transfer rate (or down converted bandwidth) was dropped below 50kSPS, which still did not fully rectify the problem.  Based on numerous experiments and trial and error, I think the single ground pin (seen in the schematics in this post) is causing problems with the SPI port voltage level sensing.  Using a single ground is a bit much to ask; I’m using the FPGA board in ways it was not designed for.  I believe that as the digital pins begin switching (changing ADC values) the current required through the ground pin increases.  A small residual resistance on the ground return with a large current can push the ground reference at the FPGA up.  This results in a lower voltage across the SPI digital inputs (i.e. between the IO pin input from the SPI clock or slave select and the FPGA ground reference).  The final confirmation of this was getting error-free results with the internally generated test pattern and an open analog input, while the same setup began generating errors when a simple wire antenna was attached to the input.  I contemplated using a differential converter at the SPI headers but decided against it, as I suspected additional problems at this sample rate with a single ground through 0.1 inch headers.
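To illustrate why a single value being read twice is so damaging, here is a small Python sketch.  It assumes the samples arrive as an interleaved I, Q, I, Q stream, which is my simplification for illustration; the function names are hypothetical.

```python
def deinterleave(stream):
    """Split an interleaved I,Q,I,Q,... stream into (i, q) pairs."""
    return list(zip(stream[0::2], stream[1::2]))


def duplicate_read(stream, index):
    """Model the value at `index` being read twice."""
    return stream[:index + 1] + stream[index:]


clean = ["I0", "Q0", "I1", "Q1", "I2", "Q2"]
corrupted = duplicate_read(clean, 2)

print(deinterleave(clean))      # [('I0', 'Q0'), ('I1', 'Q1'), ('I2', 'Q2')]
print(deinterleave(corrupted))  # [('I0', 'Q0'), ('I1', 'I1'), ('Q1', 'I2')]
```

After the duplicated value, every pair holds a Q sample in the I slot and the following I sample in the Q slot, so the quadrature relationship stays broken until another duplicate shifts the stream back.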

This project was a great non-trivial introduction to VHDL and could be broken up into purchased pieces (XuLA2) and simple unique analog designs (ADC board).  At this point I'm going to have to contemplate my options for alternate approaches.

Prj141 Schematic
Prj141 Overview
Prj141 Digital Down Converter
Prj141 Digital Interface
Prj141 Software
Prj141 Filter Design
Prj141 Filter Evaluation
Prj141 LX9 Utilization
Prj141 Higher Sampling Rates

Thursday, February 4, 2016

Prj 141 - DDC LX9 Utilization (Part 7)

As this was my first FPGA, I had no feeling for what would and would not fit within an LX9.  The design (see Part1, Part2, Part3, Part4, Part5 and Part6) takes roughly 40% of the slices in an LX9, with 17% of the slice registers and 28% of the slice LUTs being used.  A single DCM is used for internal logic, resulting in 25% DCM utilization, and only 4 of 16 DSP48s are used, for 25% DSP utilization.  A full report from ISE 14.7 is included below.

Device Utilization Summary
Slice Logic Utilization    Used    Available    Utilization    Note(s)
Number of Slice Registers 2,037 11,440 17%
    Number used as Flip Flops 2,037
    Number used as Latches 0
    Number used as Latch-thrus 0
    Number used as AND/OR logics 0
Number of Slice LUTs 1,640 5,720 28%
    Number used as logic 1,355 5,720 23%
        Number using O6 output only 719
        Number using O5 output only 98
        Number using O5 and O6 538
        Number used as ROM 0
    Number used as Memory 126 1,440 8%
        Number used as Dual Port RAM 0
        Number used as Single Port RAM 0
        Number used as Shift Register 126
            Number using O6 output only 8
            Number using O5 output only 0
            Number using O5 and O6 118
    Number used exclusively as route-thrus 159
        Number with same-slice register load 151
        Number with same-slice carry load 8
        Number with other load 0
Number of occupied Slices 577 1,430 40%
Number of MUXCYs used 704 2,860 24%
Number of LUT Flip Flop pairs used 1,931
    Number with an unused Flip Flop 288 1,931 14%
    Number with an unused LUT 291 1,931 15%
    Number of fully used LUT-FF pairs 1,352 1,931 70%
    Number of unique control sets 66
    Number of slice register sites lost to control set restrictions 175 11,440 1%
Number of bonded IOBs 25 186 13%
    Number of LOCed IOBs 25 25 100%
Number of RAMB16BWERs 5 32 15%
Number of RAMB8BWERs 2 64 3%
Number of BUFIO2/BUFIO2_2CLKs 0 32 0%
Number of BUFIO2FB/BUFIO2FB_2CLKs 0 32 0%
Number of BUFG/BUFGMUXs 3 16 18%
    Number used as BUFGs 3
    Number used as BUFGMUX 0
Number of DCM/DCM_CLKGENs 1 4 25%
    Number used as DCMs 0
    Number used as DCM_CLKGENs 1
Number of ILOGIC2/ISERDES2s 0 200 0%
Number of IODELAY2/IODRP2/IODRP2_MCBs 0 200 0%
Number of OLOGIC2/OSERDES2s 0 200 0%
Number of BSCANs 0 4 0%
Number of BUFHs 0 128 0%
Number of BUFPLLs 0 8 0%
Number of BUFPLL_MCBs 0 4 0%
Number of DSP48A1s 4 16 25%
Number of ICAPs 0 1 0%
Number of MCBs 0 2 0%
Number of PCILOGICSEs 0 2 0%
Number of PLL_ADVs 0 2 0%
Number of PMVs 0 1 0%
Number of STARTUPs 0 1 0%
Number of SUSPEND_SYNCs 0 1 0%
Average Fanout of Non-Clock Nets 3.37
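The utilization column can be reproduced from the used and available counts.  The short Python sketch below is my own check, not part of the ISE output; it assumes the percentages are truncated to whole numbers, which is consistent with every row above (for example 2,037 / 11,440 is 17.8% but is reported as 17%).

```python
def utilization_percent(used, available):
    """Whole-number utilization, truncated rather than rounded."""
    return used * 100 // available


# (used, available) pairs copied from the report above.
REPORT = {
    "slice registers": (2037, 11440),
    "slice LUTs": (1640, 5720),
    "occupied slices": (577, 1430),
    "DCM/DCM_CLKGENs": (1, 4),
    "DSP48A1s": (4, 16),
}

for name, (used, available) in REPORT.items():
    print(f"{name}: {utilization_percent(used, available)}%")
```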
