The usual rant goes to the ST guys for their mindless design*. I don’t really know anyone who does heavy app development without a serial terminal for debugging (or maybe I just don’t know many of them?). You know, gdb is good, but good old ‘dmesg’-like output is usually even more helpful.
Anyway, while other people are trying to find traces of sanity in the ST people by reversing STLinkV2, and have discovered only huge security holes so far, I decided to go a different way that works just fine with both STLinkV1 and STLinkV2.
My first idea was to stuff the VCP example into the stlink’s uC (which is an STM32F103C8T6) and add a few wires, but in the end I didn’t want to ditch STLink completely (it has helped me out a few times). Ideas? Sure!
First step. What does STLink do? Right, apart from that breakpoint/step voodoo it writes and reads memory. Sounds good? Good! Enough to do pretty much anything.
The plan:
- Place a struct with a known magic (Say, 0xdeadf00d) somewhere in sram of the MCU
- Scan sram with STLink and find the struct location
- Use two buffers within the struct to perform IO
- …
- PROFIT!
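The "scan sram" step of the plan can be sketched on the host side roughly like this. This is only an illustration, not the code from st-term: the function name stlinky_find is made up, and it assumes the host has already read a snapshot of the target's SRAM into a local buffer and that the target is little-endian (true for the STM32).

```c
#include <stddef.h>
#include <stdint.h>

#define STLINKY_MAGIC 0xdeadf00dU

/* Search a snapshot of target SRAM for the magic word.
 * Returns the byte offset of the struct inside the snapshot, or -1.
 * Every byte offset is checked, because a packed struct is not
 * guaranteed to be word-aligned. */
long stlinky_find(const uint8_t *sram, size_t len)
{
    for (size_t off = 0; off + 4 <= len; off++) {
        /* Assemble the word the way a little-endian target stores it. */
        uint32_t word = (uint32_t)sram[off]
                      | (uint32_t)sram[off + 1] << 8
                      | (uint32_t)sram[off + 2] << 16
                      | (uint32_t)sram[off + 3] << 24;
        if (word == STLINKY_MAGIC)
            return (long)off;
    }
    return -1;
}
```

Adding the returned offset to the SRAM base address gives the address the host keeps polling afterwards.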
Why? Because we get proper control of our board via USB without any need for FTDI dongles, have less mess-o-wires, and still have a reasonable transfer rate. And we can later hook it up to newlib and have a normal printf send stuff out there.
Okay, now it’s time to write some code. To make it work properly we need to know a little theory. The biggest problem here is that both the debugger and the ARM core will be accessing the very same memory, and we have little to no means of synchronization available.
First, we have to understand that all our operations will eventually be sent as AHB bus transactions, and we can only safely assume that any single aligned 32/16/8-bit read or write is atomic. The bus will serialize these for us, but we can’t rely on anything beyond that. A read followed by a write to the same place opens up a small but real window for a race condition, so we can’t do ‘check if not busy, then set busy’ on a flag that both sides may claim. With that in mind, there are two ways to make this work reliably: two ring buffers with messages posted on them (seriously, overkill), or a simpler scheme with just two buffers and their byte counts. In other words:
struct stlinky {
    uint32_t magic;          /* [3:0] */
    unsigned char bufsize;   /* 4 */
    char txsize;             /* 5 */
    char rxsize;             /* 6 */
    char reserved;           /* 7 */
    char txbuf[CONFIG_LIB_STLINKY_BSIZE];
    char rxbuf[CONFIG_LIB_STLINKY_BSIZE];
} __attribute__ ((packed));
The ‘packed’ attribute makes sure the compiler won’t insert any padding between the fields.
So the host, through the STLink, searches the sram for the magic and reads the buffer size. Whenever txsize is non-zero, the host can read out txbuf and then set txsize to 0, so that the MCU knows it can send more data.
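Here is a minimal sketch of that txsize handshake, with the debugger's part simulated on the same memory so it can run on a PC. The field layout follows the struct above, but the function names (stlinky_tx, host_drain), the fixed BSIZE and the unsigned size fields are assumptions made for the illustration, not the actual library code. The point is that each side writes txsize in only one state: the MCU only when it reads 0, the host only when it reads non-zero, so the two never compete for the same write.

```c
#include <stddef.h>
#include <stdint.h>

#define BSIZE 64                    /* stands in for CONFIG_LIB_STLINKY_BSIZE */
#define STLINKY_MAGIC 0xdeadf00dU

struct stlinky {
    uint32_t magic;
    unsigned char bufsize;
    unsigned char txsize;           /* unsigned in this sketch */
    unsigned char rxsize;
    unsigned char reserved;
    char txbuf[BSIZE];
    char rxbuf[BSIZE];
} __attribute__((packed));

/* MCU side: queue up to bufsize bytes. Returns the number of bytes
 * queued, or 0 if the host has not drained the previous chunk yet. */
size_t stlinky_tx(volatile struct stlinky *st, const char *data, size_t len)
{
    if (st->txsize != 0)
        return 0;                   /* buffer still belongs to the host */
    if (len > st->bufsize)
        len = st->bufsize;
    for (size_t i = 0; i < len; i++)
        st->txbuf[i] = data[i];
    /* One byte-sized write hands the buffer over to the host. */
    st->txsize = (unsigned char)len;
    return len;
}

/* Host side (simulated): copy out whatever is pending, then hand the
 * buffer back to the MCU by writing 0. Returns the number of bytes read. */
size_t host_drain(volatile struct stlinky *st, char *out)
{
    size_t n = st->txsize;
    for (size_t i = 0; i < n; i++)
        out[i] = st->txbuf[i];
    if (n != 0)
        st->txsize = 0;
    return n;
}
```

On real hardware host_drain would be a couple of debugger memory reads followed by a single one-byte memory write.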
Sounds cool? Anyway, I even got a working proof-of-concept that worked awesomely after being wrapped for newlib _write/_read.
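For the curious, the newlib side could look roughly like the sketch below. newlib expects a stub with the shape _write(int file, char *ptr, int len); the function here has the same shape but is called stlinky_write so it can be compiled and tried on a PC, and a real _write would just forward to it. Everything else (the idle hook, the fake host, keeping only the TX half of the struct) is made up for the illustration and is not the code from the linked source file.

```c
#include <stddef.h>

#define BSIZE 64                    /* stands in for CONFIG_LIB_STLINKY_BSIZE */

/* Only the TX half of the struct, to keep the sketch short. */
volatile unsigned char txsize;
char txbuf[BSIZE];

/* Hypothetical hook called while waiting for the host to drain txbuf.
 * On the MCU it can stay NULL (plain busy-wait); on a PC it can point
 * at a function that plays the debugger's role. */
void (*stlinky_idle)(void);

/* Blocks until every byte has been handed over, one chunk at a time. */
int stlinky_write(int file, const char *ptr, int len)
{
    int sent = 0;
    (void)file;                     /* stdout and stderr share the channel */

    while (sent < len) {
        int chunk = len - sent;
        if (chunk > BSIZE)
            chunk = BSIZE;

        while (txsize != 0)         /* previous chunk not read out yet */
            if (stlinky_idle)
                stlinky_idle();

        for (int i = 0; i < chunk; i++)
            txbuf[i] = ptr[sent + i];
        txsize = (unsigned char)chunk;  /* hand the chunk to the host */
        sent += chunk;
    }
    return len;
}

/* Fake host for PC-side experiments: appends the pending chunk to
 * 'captured' and releases the buffer, like st-term would. */
char captured[256];
int captured_len;

void fake_host_drain(void)
{
    int n = txsize;
    for (int i = 0; i < n && captured_len < (int)sizeof captured; i++)
        captured[captured_len++] = txbuf[i];
    txsize = 0;
}
```

Note that the last chunk stays in txbuf until the host polls again, so output appears with whatever latency the host's polling interval adds.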
See this header and this source file.
Or grab the whole Antares Buildsystem (it’s almost stable and nearing the release anyway) and see the example stlinky project
Oh, and you will also need my fork of stlink, that will give you the st-term utility, as seen above.
Looks like texane has just merged my pull request into his tree, so st-term should now be in upstream stlink!
* If people from ST somehow end up reading this, don’t get mad at me, I really like your MCUs. But I really hope that this will motivate you to make the next stlink a composite usb device with a proper serial port available, and to clean up the mess with the periph libs. You know, it’s kind of frustrating when the official STM32F1X periph libs package has a 3.5.X version, while the USB FS Kit has 3.6.x bundled.
UPDATE: A simple benchmark reports a transfer rate of about 16129 bytes per second with 64-byte buffers. Even more than I expected!
UPDATE2: As suggested in comments, there is ‘arm semihosting’ that does pretty much the same thing. I will make a follow-up post with the benchmark results for it and setup instructions, but so far it looks more complex in terms of embedding it in a host application.
ST people are weird, there’s no doubt about that. But ARM people have had a technology to enable console I/O (and more) for a few years now: semihosting. It works just fine with stock stlink+OpenOCD, and support for BlackMagic Probe (btw, why would anyone use stlink when there’s BMP you can flash onto the same board?) was implemented recently.
Basically, all one has to do is to send “arm semihosting enable” command to OpenOCD, link his firmware against -lrdimon (from newlib) and call initialise_monitor_handles() on start.
@Paul Fertser: Thanks for the info. BlackMagic probe is something I’ve missed and it looks that I should give it a try some time soon.
Semihosting looks quite promising as well. I wonder how fast it is. I thought of just using stlink as a cheap and dirty interface between a usbless stm32vl-discovery and the host system that will feed GCODE to the MCU. Right now I’m getting something around 10Kb/sec bandwidth with this hack, which is more or less sufficient for what I’m doing.
Basically, semihosting is a software breakpoint used as a syscall. So it’s as fast as debugger can detect the condition and fetch from the target SRAM, I’d say few times per second. But since the buffer size is arbitrary (one can fine-tune it with setvbuf()), that shouldn’t be an issue. On a bonus side, you can write to several different host files directly from the target etc. Adding this feature to texane/stlink would probably be a cleaner and more universal solution.
But why not use OpenOCD? It’s really easy to set up nowadays, no need to write lengthy config files anymore, everything works out of the box.
And if you decide to mess with the wires, BMP already exposes debugger’s UART (for the stlink target it’s currently UART2) via standard ACM interface, and requires no host firmware at all (apart from gdb itself).
One of the reasons was the fact that openocd is kind of big, while I can stuff st-term on a router with openwrt with no extroot. Anyway now I’m curious to get openocd, bmp and semihosting running and see how it works and make some benchmarks.
You mention the upstream repository … is there a URL for that? You say the terminal capability has been incorporated … could you show an example?
This page is the first hit in a search for st-term … can you consider those readers who aren’t deeply knowledgeable about the surrounding topics and spell it out a bit more? For example, what versions of STLINK have the st-term? Does it require anything else such as openocd?
Thanks,
Chris