It might be easier to (1) calculate both the start offset and the end offset, then (2) truncate the start offset down to a cache-line boundary, (3) round the end offset up to a cache-line boundary, and (4) calculate the size argument by subtracting the start offset from the end offset.
Regarding the end offset: when calculating it via surface.w()*rect.y2() + rect.x2(), the result refers to the pixel at the lower-right corner of the dirty rectangle. You have to add 1 to this offset value, so that the subsequent subtraction yields the number of covered pixels, not the difference. E.g., think of a rectangle of only one pixel where p1 == p2. Here we want to flush the cache line around this pixel. The difference between the start and end offsets would be zero, but the number of covered pixels is one.
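The four steps above can be sketched as a small standalone function. This is only an illustration under assumptions: Rect, Range, and flush_range are hypothetical names and not part of the Genode API, the rectangle corners are taken as inclusive, and the cache-line size is assumed to be a power of two.

```cpp
#include <cstddef>

/* Hypothetical helper types for illustration, not part of the Genode API */
struct Rect  { unsigned x1, y1, x2, y2; };   /* inclusive corners, in pixels */
struct Range { std::size_t start, size; };   /* byte offset and byte length  */

/*
 * Byte range spanning the first to the last pixel of 'r' in a linear
 * buffer with 'width' pixels per row, widened to whole cache lines.
 * Assumption: 'line' is a power of two.
 */
static Range flush_range(Rect r, std::size_t width,
                         std::size_t bytes_per_pixel, std::size_t line)
{
    /* (1) offset of the first pixel, and offset one past the last pixel */
    std::size_t start = (width*r.y1 + r.x1)*bytes_per_pixel;
    std::size_t end   = (width*r.y2 + r.x2 + 1)*bytes_per_pixel;

    /* (2) truncate the start offset down to a cache-line boundary */
    start &= ~(line - 1);

    /* (3) round the end offset up to a cache-line boundary */
    end = (end + line - 1) & ~(line - 1);

    /* (4) the size argument is the difference of both offsets */
    return { start, end - start };
}
```

For the one-pixel case mentioned above, the "+ 1" is what makes the size come out as one full cache line instead of zero. Note that the range is a single contiguous span from the first to the last pixel, so for a rectangle narrower than the surface it also covers the bytes between the rows.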
I have tried with 16, 32, and 64. Same result. Vertical movement leaves traces behind the mouse pointer and bigger artifacts for windows. These sorts of things are really my weak spot.
int a = 64, r = 0;
Genode::addr_t invalid = (rect.x1() + (rect.y1()*size.w()))*sizeof(Pixel);
log("unaligned invalid: ", invalid);
invalid = invalid & ~63;
Genode::addr_t endoffset = (1 + (size.w()*rect.y2()) + rect.x2())*sizeof(Pixel);
log("endoffset: ", endoffset);
r = endoffset % a;
endoffset = r ? endoffset + (a - r) : endoffset;
log("invalid: ", invalid);
log("invalidz: ", endoffset);
Genode::addr_t xxx = (Genode::addr_t)surface.addr();
cache_clean_invalidate_data(xxx + invalid, endoffset - invalid);