One of the open questions I had no data on was the FFT performance of the ARM processor. The Si (spectral investigation) application was designed to do all processing except display on the BBB. This allows for thin Java clients (e.g. a tablet or phone) that handle control and presentation only. FFT performance is an important aspect of spectral evaluation in the application (at this point I haven't moved to a polyphase filter, but wanted to focus on FFT-based processing as a start). The original work used a simple FFT in C from a reference text, an approach intended to be instructive, not high performance. It was then updated to use the FFTW package. The table below captures the measured time on the ARM for an FFT plus a magnitude-squared calculation, comparing both implementations. Note: these are double-precision floating-point FFTs (not integer, which will be evaluated later if need be; I wanted to start simple). Also, the magnitude-squared operation appears to take a very small fraction of the time.
FFT Size | Time (µs) Reference | Time (µs) FFTW
256 | 907 | 557
512 | 2069 | 777
1024 | 4978 | 1619
2048 | 10156 | 4128
4096 | 21990 | 9143
8192 | 49383 | 20219
16384 | 116065 | 45883
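To make the comparison easier to read, here is a small sketch that computes the speedup of FFTW over the reference FFT at each size. The numbers are copied from the table; the names `MEASURED_US` and `speedup` are my own and are not part of the Si application.

```python
# Measured times in microseconds, copied from the table above:
# FFT size -> (reference FFT, FFTW). Both include the magnitude-squared step.
MEASURED_US = {
    256: (907, 557),
    512: (2069, 777),
    1024: (4978, 1619),
    2048: (10156, 4128),
    4096: (21990, 9143),
    8192: (49383, 20219),
    16384: (116065, 45883),
}


def speedup(size):
    """Return how many times faster FFTW was than the reference FFT."""
    ref_us, fftw_us = MEASURED_US[size]
    return ref_us / fftw_us


if __name__ == "__main__":
    for size in sorted(MEASURED_US):
        print(f"{size:6d} points: FFTW is {speedup(size):.2f}x faster")
```

On this data the speedup ranges from roughly 1.6x at 256 points to roughly 3.1x at 1024 points, and sits near 2.4x to 2.5x for the larger sizes.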
The impetus for focusing on this metric is that in a spectrum-analyzer-like application, one of the key metrics is the refresh rate at a given frequency span. With the hardware at hand this translates into frequency stepping speed, and the driving factor is the collection of samples. Based on previous noise measurements, a good starting point seems to be around an 8k FFT. At 1 MSPS, collecting 8k samples takes about 8 ms [i.e. (8E3 samples)/(1E6 samples/sec) = 8E-3 seconds]. Sample collection can be overlapped with the power spectrum estimate calculation (i.e. the FFT) and transmission of results. So the bottom line is that the target is 8 ms per 8k FFT, which is not being met: per the table above, the 8k FFT takes about 20 ms with FFTW and about 49 ms with the reference code. There are a couple of options, including switching to an alternate power spectrum estimation technique or evaluating integer FFT performance (FFTW 3.3.3 includes ARM NEON support). This will be deferred until further hardware characterization is complete. An interim target of ~30 steps per second appears readily achievable, which at 250 kHz per step yields a sweep rate of 7.5 MHz per second.
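The timing budget can be sketched as a few lines of arithmetic. This is only a rough model under my own assumptions: "8k" is taken as 8192 samples (which gives 8.192 ms rather than the rounded 8 ms), the time to transmit results is ignored, and the function names are made up for illustration.

```python
SAMPLE_RATE_HZ = 1_000_000   # 1 MSPS, as stated above
STEP_SPAN_HZ = 250_000       # 250 kHz covered per frequency step


def collection_time_us(n_samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Time needed to collect n_samples, in microseconds."""
    return n_samples * 1_000_000 / sample_rate_hz


def step_time_us(n_samples, fft_time_us, overlapped=True):
    """Time per frequency step, in microseconds.

    If collection overlaps with processing, the slower of the two sets the
    pace; otherwise the two times add. Result transmission is ignored.
    """
    collect_us = collection_time_us(n_samples)
    if overlapped:
        return max(collect_us, fft_time_us)
    return collect_us + fft_time_us


def steps_per_second(step_us):
    """Number of frequency steps per second for a given step time."""
    return 1_000_000 / step_us


def sweep_rate_hz_per_s(steps_per_s, step_span_hz=STEP_SPAN_HZ):
    """Spectrum covered per second of sweeping, in Hz per second."""
    return steps_per_s * step_span_hz


if __name__ == "__main__":
    fftw_8k_us = 20219  # measured 8192-point FFTW time from the table
    for overlapped in (True, False):
        step_us = step_time_us(8192, fftw_8k_us, overlapped)
        rate = steps_per_second(step_us)
        print(f"overlapped={overlapped}: {step_us:.0f} us/step, {rate:.1f} steps/s")
    print(f"30 steps/s -> {sweep_rate_hz_per_s(30) / 1e6} MHz/s")
```

With the measured 8k FFTW time, the FFT rather than sample collection sets the pace. Even without any overlap, this model stays above 30 steps per second, which is consistent with the interim target.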