The Problem

As you can see in the video above, the redraw rate of the screen is pretty slow. With default settings on the ESP8266 it took about 17 seconds to draw a single bitmap to the 320x480 screen.

I managed to improve this to around 2.5 seconds, as you can see in the video below:

The Research

I noticed that the clearScreen call from Ucglib was a lot faster than any drawing I did with drawPixel. The question was: why? If we look at the implementation of the clearScreen function we see:

void ucg_ClearScreen(ucg_t *ucg)
{
  ucg_SetColor(ucg, 0, 0, 0, 0);
  ucg_SetMaxClipRange(ucg);
  ucg_DrawBox(ucg, 0, 0, ucg_GetWidth(ucg), ucg_GetHeight(ucg)); // HERE
  ucg_SetColor(ucg, 0, 255, 255, 255);
}

Three of the four lines only set properties (ucg_SetColor, ucg_SetMaxClipRange), so the actual drawing is done by ucg_DrawBox. We need to dig deeper into this function:

void ucg_DrawBox(ucg_t *ucg, ucg_int_t x, ucg_int_t y, ucg_int_t w, ucg_int_t h)
{
  while( h > 0 )
  {
    ucg_DrawHLine(ucg, x, y, w); // HERE
    h--;
    y++;
  }  
}

As we can see, a box is drawn line by line (while( h > 0 ), …): each line starts at position (x, y) and has a length of w. Those parameters are passed to ucg_DrawHLine, which is a wrapper for ucg_Draw90Line:

void ucg_DrawHLine(ucg_t *ucg, ucg_int_t x, ucg_int_t y, ucg_int_t len)
{
  ucg_Draw90Line(ucg, x, y, len, 0, 0);
}

… which does this:

void ucg_Draw90Line(ucg_t *ucg, ucg_int_t x, ucg_int_t y, ucg_int_t len, ucg_int_t dir, ucg_int_t col_idx)
{
  ucg->arg.pixel.rgb.color[0] = ucg->arg.rgb[col_idx].color[0];
  ucg->arg.pixel.rgb.color[1] = ucg->arg.rgb[col_idx].color[1];
  ucg->arg.pixel.rgb.color[2] = ucg->arg.rgb[col_idx].color[2];
  ucg->arg.pixel.pos.x = x;
  ucg->arg.pixel.pos.y = y;
  ucg->arg.len = len;
  ucg->arg.dir = dir;
  ucg_DrawL90FXWithArg(ucg);
}

Again, we have a lot of preparation of the ucg->arg struct (of type _ucg_arg_t). We select the color via col_idx, previously set to 0, as well as the draw direction dir (also 0). This means we have to look deeper, at ucg_DrawL90FXWithArg:

void ucg_DrawL90FXWithArg(ucg_t *ucg)
{
  ucg->device_cb(ucg, UCG_MSG_DRAW_L90FX, &(ucg->arg));
}

… which is another wrapper. It calls the callback function device_cb with the message UCG_MSG_DRAW_L90FX and a pointer to our ucg->arg struct (&(ucg->arg)).

Now, what is this callback function?

If you look at the initializer of our ucg instance, we see that it uses a display-specific constructor called Ucglib_ILI9486_18x320x480_HWSPI (from Ucglib.h). This constructor stores the ucg_dev_ili9486_18x320x480 callback in the member dev_cb, which is then assigned to ucg->device_cb when ucg.begin() is called.

So far so confusing :).
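
To make the wiring a bit less confusing, here is a minimal host-side sketch of the same pattern. It is not Ucglib code: Gfx, Display, MSG_DRAW_LINE and demo_device_cb are hypothetical stand-ins for ucg_t, the Ucglib wrapper class, UCG_MSG_DRAW_L90FX and the display callback. It only shows the idea of a constructor that remembers a callback, a begin() that installs it, and a draw call that dispatches through it.

```cpp
#include <cassert>

// Hypothetical stand-ins for ucg_t and its device callback type.
struct Gfx;
typedef int (*device_cb_t)(Gfx *g, int msg, void *data);

struct Gfx {
  device_cb_t device_cb = nullptr;  // plays the role of ucg->device_cb
  int len = 0;                      // plays the role of ucg->arg.len
  int last_msg = -1;                // records what the device received
};

const int MSG_DRAW_LINE = 1;        // plays the role of UCG_MSG_DRAW_L90FX

// A display-specific callback, comparable to ucg_dev_ili9486_18x320x480.
int demo_device_cb(Gfx *g, int msg, void *data) {
  (void)data;                       // a real device would read the args here
  g->last_msg = msg;
  return 1;
}

// The C++ wrapper: the constructor only remembers the callback,
// begin() installs it into the C-style struct.
class Display {
 public:
  explicit Display(device_cb_t cb) : dev_cb(cb) {}
  void begin() { gfx.device_cb = dev_cb; }

  // Comparable to ucg_DrawL90FXWithArg: fill the args, then dispatch.
  int drawLine(int len) {
    assert(gfx.device_cb != nullptr);  // begin() must have been called
    gfx.len = len;
    return gfx.device_cb(&gfx, MSG_DRAW_LINE, &gfx.len);
  }

  Gfx gfx;

 private:
  device_cb_t dev_cb;
};
```

Swapping in a different display then only means passing a different callback to the constructor; the drawing code stays untouched.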

All this means we need to take a look at ucg_dev_ili9486_18x320x480 (found in Arduino/libraries/Ucglib/src/clib/ucg_dev_tft_320x480_ili9486.c):

ucg_int_t ucg_dev_ili9486_18x320x480(ucg_t *ucg, ucg_int_t msg, void *data)
{
  switch (msg)
  {
  case UCG_MSG_DEV_POWER_UP:
    /* 1. Call to the controller procedures to setup the com interface */
    if (ucg_dev_ic_ili9486_18(ucg, msg, data) == 0)
      return 0;

    /* 2. Send specific init sequence for this display module */
    ucg_com_SendCmdSeq(ucg, ucg_tft_320x480_ili9486_init_seq);

    return 1;

  case UCG_MSG_DEV_POWER_DOWN:
    /* let do power down by the conroller procedures */
    return ucg_dev_ic_ili9486_18(ucg, msg, data);

  case UCG_MSG_GET_DIMENSION:
    ((ucg_wh_t *)data)->w = 320;
    ((ucg_wh_t *)data)->h = 480;
    return 1;
  }

  /* all other messages are handled by the controller procedures */
  return ucg_dev_ic_ili9486_18(ucg, msg, data);
}

Without looking at the details, we see that UCG_MSG_DEV_POWER_UP, UCG_MSG_DEV_POWER_DOWN and UCG_MSG_GET_DIMENSION are the only messages consumed by this function. The remaining, unhandled messages are passed to ucg_dev_ic_ili9486_18:

// skipped ...
  case UCG_MSG_DRAW_L90FX:
    ucg_handle_ili9486_l90fx(ucg);
    return 1;
// skipped ...

… which calls ucg_handle_ili9486_l90fx:

ucg_int_t ucg_handle_ili9486_l90fx(ucg_t *ucg)
{
  uint8_t c[3];
  ucg_int_t tmp;
  if (ucg_clip_l90fx(ucg) != 0)
  {
    switch (ucg->arg.dir)
    {
    case 0:
      ucg_com_SendCmdSeq(ucg, ucg_ili9486_set_pos_dir0_seq);
      break;
    case 1:
      ucg_com_SendCmdSeq(ucg, ucg_ili9486_set_pos_dir1_seq);
      break;
    case 2:
      tmp = ucg->arg.pixel.pos.x;
      ucg->arg.pixel.pos.x = 319 - tmp;
      ucg_com_SendCmdSeq(ucg, ucg_ili9486_set_pos_dir2_seq);
      ucg->arg.pixel.pos.x = tmp;
      break;
    case 3:
    default:
      tmp = ucg->arg.pixel.pos.y;
      ucg->arg.pixel.pos.y = 479 - tmp;
      ucg_com_SendCmdSeq(ucg, ucg_ili9486_set_pos_dir3_seq);
      ucg->arg.pixel.pos.y = tmp;
      break;
    }
    c[0] = ucg->arg.pixel.rgb.color[0];
    c[1] = ucg->arg.pixel.rgb.color[1];
    c[2] = ucg->arg.pixel.rgb.color[2];
    ucg_com_SendRepeat3Bytes(ucg, ucg->arg.len, c);
    ucg_com_SetCSLineStatus(ucg, 1); /* disable chip */
    return 1;
  }
  return 0;
}

ucg->arg.dir was 0 so this actually boils down to (please notice my comments in the code):

uint8_t c[3];
// this is some intro magic ;)
ucg_com_SendCmdSeq(ucg, ucg_ili9486_set_pos_dir0_seq);
// prepare three bytes of color data
c[0] = ucg->arg.pixel.rgb.color[0];
c[1] = ucg->arg.pixel.rgb.color[1];
c[2] = ucg->arg.pixel.rgb.color[2];
// send the three bytes contained in c ucg->arg.len times
ucg_com_SendRepeat3Bytes(ucg, ucg->arg.len, c);
// this is some outro magic ;)
ucg_com_SetCSLineStatus(ucg, 1); /* disable chip */

Remember that the width w of a line was stored in ucg->arg.len, so ucg_com_SendRepeat3Bytes sends the given three-byte array exactly w times. ucg_com_SendRepeat3Bytes is itself another wrapper around a callback, which passes the message UCG_COM_MSG_REPEAT_3_BYTES down to the place where the right protocol puts the bytes on the wire. Welcome to the world of programming. Still, this is exactly what you need to do when you write a library that is supposed to work with many displays and with many ways of interfacing with them.

Knowing that we use 4-wire SPI, the implementation of ucg_com_SendRepeat3Bytes that applies in our case boils down to the following, repeated once per pixel:

SPI.transfer(data[0]);
SPI.transfer(data[1]);
SPI.transfer(data[2]);
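
To see how those three transfers add up over a whole screen, here is a small host-side sketch. FakeSpi, sendRepeat3Bytes and clearScreenBytes are hypothetical names of mine, not Ucglib functions, and the sketch deliberately ignores the command bytes of the position sequence that is sent before each line.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the SPI bus: it just records what was sent.
struct FakeSpi {
  std::vector<uint8_t> sent;
  void transfer(uint8_t b) { sent.push_back(b); }
};

// Sends the same three color bytes `len` times, once per pixel.
void sendRepeat3Bytes(FakeSpi &spi, uint16_t len, const uint8_t *color) {
  assert(color != nullptr);
  for (uint16_t i = 0; i < len; i++) {
    spi.transfer(color[0]);
    spi.transfer(color[1]);
    spi.transfer(color[2]);
  }
}

// Color bytes on the wire for a full-screen box drawn line by line.
std::size_t clearScreenBytes(uint16_t width, uint16_t height) {
  FakeSpi spi;
  const uint8_t black[3] = {0, 0, 0};
  for (uint16_t y = 0; y < height; y++) {
    sendRepeat3Bytes(spi, width, black);  // one horizontal line
  }
  return spi.sent.size();
}
```

For the 320x480 screen this gives 320 * 480 * 3 = 460800 color bytes per frame.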

Thus, the lesson of all this digging is that, to send pixels to an ILI9486 display, we do the following:

#define NUMPIXELS (320UL * 480UL)   // pixels on the screen
#define NUMBYTES  (NUMPIXELS * 3UL) // three color bytes per pixel
uint8_t pixels[NUMBYTES]; // get this from somewhere
ucg.getUcg()->arg.pixel.pos.x = 0; // where we start to draw
ucg.getUcg()->arg.pixel.pos.y = 0;
ucg.getUcg()->arg.len = NUMPIXELS; // how many pixels
ucg.getUcg()->arg.dir = 0;
ucg_com_SendCmdSeq(ucg.getUcg(), ucg_ili9486_set_pos_dir0_seq); // init comms
// a 32-bit counter is needed: NUMBYTES does not fit into a uint16_t
for (uint32_t i = 0; i < NUMBYTES; i++) {
    SPI.transfer(pixels[i]);
}
ucg_com_SetCSLineStatus(ucg.getUcg(), 1); // stop comms

So…..

The Implementation

First, we need a copy of the array of magic bytes for the magic intro. It is defined in Ucglib/src/clib/ucg_dev_ic_ili9486.c, which we can’t import, so we copy it into our sketch:

const ucg_pgm_uint8_t ucg_ili9486_set_pos_dir0_seq[] = {
    UCG_CS(0), /* enable chip */

    /* 0x008 horizontal increment (dir = 0) */
    /* 0x008 vertical increment (dir = 1) */
    /* 0x048 horizontal deccrement (dir = 2) */
    /* 0x088 vertical deccrement (dir = 3) */
    UCG_C11(0x036, 0x008),
    UCG_C10(0x02a), UCG_VARX(8, 0x01, 0), UCG_VARX(0, 0x0ff, 0), UCG_A2(0x001, 0x03f), /* set x position */
    UCG_C10(0x02b), UCG_VARY(8, 0x01, 0), UCG_VARY(0, 0x0ff, 0), UCG_A2(0x001, 0x0df), /* set y position */

    UCG_C10(0x02c), /* write to RAM */
    UCG_DATA(),     /* change to data mode */
    UCG_END()
};
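
The magic bytes are less magic once decoded. Commands 0x2a, 0x2b and 0x2c are the usual column address set, page address set and memory write commands, and each of the first two takes four parameter bytes: start high, start low, end high, end low. The sketch below rebuilds those four bytes on the host. It rests on an assumption: my reading of UCG_VARX(shift, and, or) is that it emits ((x >> shift) & and) | or, and UCG_VARY does the same for y. The names varByte and windowParams are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Assumed meaning of UCG_VARX(shift, and_mask, or_mask): one data byte
// derived from the current x position (UCG_VARY: same, but for y).
uint8_t varByte(uint16_t pos, uint8_t shift, uint8_t and_mask, uint8_t or_mask) {
  return static_cast<uint8_t>(((pos >> shift) & and_mask) | or_mask);
}

// The four parameter bytes that follow command 0x2a (or 0x2b):
// start high, start low, end high, end low.
std::array<uint8_t, 4> windowParams(uint16_t start, uint16_t last) {
  assert(start <= last);
  return {{
      varByte(start, 8, 0x01, 0),         // UCG_VARX(8, 0x01, 0)
      varByte(start, 0, 0xff, 0),         // UCG_VARX(0, 0x0ff, 0)
      static_cast<uint8_t>(last >> 8),    // first byte of UCG_A2(...)
      static_cast<uint8_t>(last & 0xff),  // second byte of UCG_A2(...)
  }};
}
```

With x = 0 the column window runs from 0 to 0x013f = 319, and with y = 0 the row window runs from 0 to 0x01df = 479, so the window covers the whole screen. As far as I understand the controller, it then advances the write position by itself after each pixel, which is why one long stream of color bytes can fill the entire display.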

Then, we add another file upload request handler to the server running on the ESP8266, that uses this style of updating the screen.

The handler:

void rgbApiFullScreen() {
  HTTPUpload& upload = server.upload();
  if (upload.status == UPLOAD_FILE_START) {
    hasImage = true;
    // initialize ucg data
    ucg.getUcg()->arg.pixel.pos.x = 0;
    ucg.getUcg()->arg.pixel.pos.y = 0;
    ucg.getUcg()->arg.len = 320 * 480;
    ucg.getUcg()->arg.dir = 0;
    // initialize communication with display a.k.a the magic intro
    ucg_com_SendCmdSeq(ucg.getUcg(), ucg_ili9486_set_pos_dir0_seq);
  } else if (upload.status == UPLOAD_FILE_WRITE) {
    // send data to the display directly
    for (size_t i = 0; i < upload.currentSize; i++) {
      SPI.transfer(upload.buf[i]);
    }
  } else if (upload.status == UPLOAD_FILE_END) {
    // end communication with display, a.k.a the magic outro
    ucg_com_SetCSLineStatus(ucg.getUcg(), 1);
    server.send(200);
  } else {
    server.send(500, "text/plain", "500: error");
  }
}
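
The handler is a small state machine: open the display once, forward every chunk, close the display at the end. The host-side sketch below models just that flow so it can be tried without hardware. Phase, FrameSink and kFrameBytes are hypothetical names, the display calls are replaced by comments, and the size check at the end is my own addition, not something the handler above does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the three upload phases the handler reacts to.
enum class Phase { Start, Write, End };

const std::size_t kFrameBytes = 320u * 480u * 3u;  // 3 bytes per pixel

struct FrameSink {
  bool chipSelected = false;  // true between the "intro" and the "outro"
  std::size_t bytesSent = 0;  // bytes forwarded to the display so far

  // Returns true only at End, and only if a complete frame arrived.
  bool onUpload(Phase phase, const uint8_t *buf, std::size_t size) {
    switch (phase) {
      case Phase::Start:
        chipSelected = true;   // stands in for ucg_com_SendCmdSeq(...)
        bytesSent = 0;
        return false;
      case Phase::Write:
        assert(chipSelected);  // Write must come after Start
        for (std::size_t i = 0; i < size; i++) {
          (void)buf[i];        // stands in for SPI.transfer(buf[i])
          bytesSent++;
        }
        return false;
      case Phase::End:
        chipSelected = false;  // stands in for ucg_com_SetCSLineStatus(..., 1)
        return bytesSent == kFrameBytes;
    }
    return false;
  }
};
```

Because each chunk goes straight to the display, the ESP8266 never has to hold the whole 460800-byte frame in RAM.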

Registering the handler:

void setupServer() {
  server.on("/api/rgb", HTTP_POST, []() {
    parseUploadParams();
    server.send(200);
  }, rgbApi);
  server.on("/api/fullscreen", HTTP_POST, []() {
    server.send(200);
  }, rgbApiFullScreen);
  server.begin();
}

I’ve left the old API /api/rgb in place, as it supports partial refresh and placing the received raw bitmap anywhere on the screen, albeit more slowly. The new API is called /api/fullscreen: it only supports receiving fullscreen images of 320x480 pixels, but it is a lot faster.

The curl request now looks as follows:

curl -i -X POST -F "data=@s1.rgb" "http://192.168.178.50/api/fullscreen?x=0&y=0&w=320&h=480"
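
If you have no raw image at hand, a test file is easy to generate. The sketch below builds a 320x480 gradient as 3 bytes per pixel, row by row, which matches the 460800 bytes the handler streams to the display. The R, G, B byte order is an assumption on my side, and makeGradientFrame and writeFrame are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Builds one frame as raw bytes: 3 bytes per pixel, row by row.
// The R, G, B byte order is an assumption; swap it if colors look wrong.
std::vector<uint8_t> makeGradientFrame(int width, int height) {
  assert(width > 1 && height > 1);
  std::vector<uint8_t> frame;
  frame.reserve(static_cast<std::size_t>(width) * height * 3);
  for (int y = 0; y < height; y++) {
    for (int x = 0; x < width; x++) {
      frame.push_back(static_cast<uint8_t>(x * 255 / (width - 1)));   // red: left to right
      frame.push_back(static_cast<uint8_t>(y * 255 / (height - 1)));  // green: top to bottom
      frame.push_back(0x40);                                          // constant blue
    }
  }
  return frame;
}

// Writes the frame to disk; returns true on success.
bool writeFrame(const char *path, const std::vector<uint8_t> &frame) {
  std::FILE *f = std::fopen(path, "wb");
  if (!f) return false;
  const std::size_t written = std::fwrite(frame.data(), 1, frame.size(), f);
  std::fclose(f);
  return written == frame.size();
}
```

Calling writeFrame("s1.rgb", makeGradientFrame(320, 480)) from a small main() produces a file that can be posted with the curl command above.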

The resulting code is: iot-screen-v2.ino.

Hint: I set the ESP8266 to 160 MHz and increased the SPI frequency after the ucg.begin(...) call to get more speed:

SPI.setFrequency(40000000);
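
To put the numbers into perspective, here is a back-of-the-envelope calculation, assuming 3 bytes per pixel and looking only at the raw bits on the SPI wire. The function names are hypothetical; the 17 and 2.5 second figures are the measurements from the beginning of this post.

```cpp
#include <cassert>
#include <cmath>

// Bytes in one full frame: 320 x 480 pixels, 3 bytes each.
const double kBytesPerFrame = 320.0 * 480.0 * 3.0;

// Average throughput in bytes per second for a measured frame time.
double bytesPerSecond(double seconds) {
  assert(seconds > 0.0);
  return kBytesPerFrame / seconds;
}

// Time the bits alone need on the wire at a given SPI clock, in seconds.
double wireSeconds(double spiHz) {
  assert(spiHz > 0.0);
  return kBytesPerFrame * 8.0 / spiHz;
}
```

bytesPerSecond(17.0) comes out at roughly 27 KB/s and bytesPerSecond(2.5) at 184320 bytes per second, while wireSeconds(40e6) is only about 0.092 seconds. So even at 2.5 seconds per frame the SPI clock is not the limit; the rest of the time is presumably spent on the HTTP upload and on per-byte call overhead.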

Have fun.