2015
04.27

Yay, I’ve killed all nightly builds. Sorry 😉

That was the short version. Last weekend I was busy removing some legal hacks from the AROS sources. The hack on the schedule was the commonly used ThisTask pointer in SysBase. Now, at least in my local branch of AROS for RaspberryPi, SysBase->ThisTask points to a nirvana place where all code is either happily crashing, or dead, or both. ThisTask points to NULL :)

No, it didn’t disappear completely. The ThisTask pointer has been moved to (and is used from) something similar to thread-local storage. It is local, but not local to a thread. It is local to a CPU core. On RPi2 we use four independent local storages and each of them has its own ThisTask pointer. Don’t hold your breath, it’s not SMP yet. Far from it :) The scheduler works only on CPU#0. At least for now.

The TLS is used exclusively by kernel.resource, which knows best about the low-level part of the system. Exec has gained two new architecture-specific macros, named GET_THIS_TASK and SET_THIS_TASK(x). On all other architectures they expand to SysBase->ThisTask; on RaspberryPi they expand to TLS_GET(ThisTask) and the equivalent TLS_SET. What about the rest of the AROS code? Well, there the only sane way to get ThisTask shall be used: the FindTask(NULL) call.
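To make the idea a bit more concrete, here is a minimal host-side sketch of how such macros could be wired up. It is not the actual AROS code: the struct layouts are stripped down, the argument list of TLS_SET is a guess, and the names tls_area, current_core, this_core_tls, switch_to and THIS_TASK_IN_SYSBASE are hypothetical. On real hardware the core number would come from the CPU itself rather than from a variable.

```c
#include <assert.h>
#include <stddef.h>

/* Stripped-down stand-ins for the real AROS types (assumption). */
struct Task     { const char *name; };
struct ExecBase { struct Task *ThisTask; };

/* In the TLS variant SysBase->ThisTask is simply left at NULL. */
struct ExecBase sysbase_storage = { NULL };
struct ExecBase *SysBase = &sysbase_storage;

/* One local storage block per CPU core, four of them as on the RPi2. */
#define MAX_CORES 4
struct core_tls { struct Task *ThisTask; };
static struct core_tls tls_area[MAX_CORES];

/* Hypothetical: stands in for "which core am I running on". */
static unsigned current_core = 0;

static struct core_tls *this_core_tls(void)
{
    assert(current_core < MAX_CORES);
    return &tls_area[current_core];
}

/* Assumed shape of the TLS accessors. */
#define TLS_GET(field)        (this_core_tls()->field)
#define TLS_SET(field, value) (this_core_tls()->field = (value))

#ifdef THIS_TASK_IN_SYSBASE          /* hypothetical switch: the classic way */
#define GET_THIS_TASK     (SysBase->ThisTask)
#define SET_THIS_TASK(x)  (SysBase->ThisTask = (x))
#else                                /* the per-core way */
#define GET_THIS_TASK     TLS_GET(ThisTask)
#define SET_THIS_TASK(x)  TLS_SET(ThisTask, (x))
#endif

/* What a scheduler might do on a task switch: remember the old task,
   install the new one, hand the old one back to the caller. */
static struct Task *switch_to(struct Task *next)
{
    struct Task *prev = GET_THIS_TASK;
    SET_THIS_TASK(next);
    return prev;
}
```

The point of hiding the access behind two macros is that Exec itself does not need to know where the pointer lives; only the architecture-specific definition changes.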

And here we come to the point where I’ve killed all nightlies. During my ThisTask removal fun I accidentally broke one macro in the AROSTCP network stack :) It should be fixed already.

2015
04.21

Porting AROS to RaspberryPi is a lot of fun, I have said that already. There’s also a lot of frustration, and you know that. This time it was because of the 4 CPU cores…

From the very beginning I noticed that the frame buffer was relatively slow. At least not as fast as I would expect from a nearly 1 GHz machine. Well, there was an issue, but I ignored it at first. I continued with the AROS port and came to a point where AROS was booting into the desktop and running programs. As a simple example I added Clock to the WBStartup folder, thus making this app start automatically once the system is up. Of course I had full debug enabled on the screen console and over the serial port.

Huh, it took AROS nearly 30 seconds to boot. Not bad, but could be better for sure. The slow redrawing of the screen was worrying me but hey, we do have the simplest graphics driver ever. No acceleration, just a simple portion of memory filled pixel by pixel (with some help from our base graphics class of course). So far so good.

Then out of curiosity I decided to take a look at an old Raspberry Pi model I have on my desk. I booted it, looked at the Clock and went mad. The old Raspberry Pi with its ARM11 CPU booted in about 20 seconds. That is 2/3 of the RaspberryPi2 boot time! Can’t be, I thought. The new machine cannot be that bad, can it? Have I missed some cache setup? The frame buffer can’t be cached, right? Why was the Linux frame buffer console faster?

Finally I found a forum where the Bare Metal guys were discussing their great efforts to develop standalone software for RaspberryPi. Luckily for me one of them had an issue similar to mine. He also led me to the final solution. It turned out that the CPU cores of RaspberryPi2 are not silently sleeping and waiting for an interrupt when start.elf transfers control over to the ARM CPU. No, instead they are busy looping and polling the registers, anxiously waiting to start and do some useful work. As you can imagine, polling is not a very efficient technique, rather the contrary. The additional CPU cores were stealing precious bus cycles, leaving fewer for CPU#0, which was actually running AROS code. Eureka!

There are two solutions and I have found both of them to work with AROS. The first one is to extend the config.txt file (the file which is read and parsed by VideoCore). There, one has to add the following parameter:

 arm_control=0x1000

It forces the additional CPUs to go to sleep and wait for interrupts instead of busy looping. I tested it and it really helped. After adding that line AROS really flies on that tiny computer! The frame buffer refreshes quickly, the display redraws quickly, a few demos redraw their windows nearly immediately. Boo! Now the machine not only feels faster than the old RPi, it actually is faster.

Just letting the additional CPUs sleep is good, but not something I liked very much. Sure, start.elf does a good job, but I wanted to make AROS do that job. So I started to code :) I wrote a small assembly routine, a trampoline which initializes the caches and MMU of the woken-up core. The trampoline also initializes the supervisor stack and jumps to a routine in C code. At the moment the C routine is rather simple. It checks the CPU type, enables VFP and enters an endless wait-for-interrupt loop. Ah, the C routine babbles on the system log of course, to let me know it is actually working. What I got was:

[KRN] Co]e o Co eUp ani idiwir igr rrutatuots
s
0a008

Uh. Not very readable. Forgot something? Ah yes, there is no locking in our bug() function, which means all cores were fighting over the serial line. Proper locking will come later, since it has to be done right; for now I have only added some delays. This is how it looks now:

Screenshot 2015-04-21 at 21.53.40

Please note that the “Core x up and waiting” lines are each sent to the console by a different ARM core. It’s not SMP, not even AMP. It’s just a small initialization routine. But at least it works as expected…
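For the curious, the bit-level side of such an initialization routine (check CPU type, enable VFP) can be sketched in plain C as below. This is only an illustration and not the routine from the AROS tree: all names are hypothetical, and the real code has to read and write the registers with coprocessor instructions, which this host-side sketch only mentions in comments.

```c
#include <assert.h>
#include <stdint.h>

/* MIDR (main ID register, read with "mrc p15, 0, rX, c0, c0, 0"):
   bits [15:4] hold the primary part number. */
#define MIDR_PART(midr)   (((midr) >> 4) & 0xfffu)
#define PART_ARM1176      0xb76u   /* ARM11 core of the original Raspberry Pi */
#define PART_CORTEX_A7    0xc07u   /* cores of the Raspberry Pi 2 */

/* CPACR (coprocessor access control, "mrc/mcr p15, 0, rX, c1, c0, 2"):
   bits [23:20] grant full access to coprocessors 10 and 11, i.e. the VFP. */
#define CPACR_CP10_CP11_FULL (0xfu << 20)

/* FPEXC (accessed with vmrs/vmsr): bit 30 is the global VFP enable. */
#define FPEXC_EN (1u << 30)

enum cpu_kind { CPU_UNKNOWN, CPU_ARM1176, CPU_CORTEX_A7 };

static enum cpu_kind identify_cpu(uint32_t midr)
{
    switch (MIDR_PART(midr)) {
    case PART_ARM1176:   return CPU_ARM1176;
    case PART_CORTEX_A7: return CPU_CORTEX_A7;
    default:             return CPU_UNKNOWN;
    }
}

/* Value to write back to CPACR so that VFP instructions are permitted. */
static uint32_t cpacr_with_vfp(uint32_t cpacr)
{
    return cpacr | CPACR_CP10_CP11_FULL;
}

/* Value to write back to FPEXC to switch the VFP unit on. */
static uint32_t fpexc_with_vfp(uint32_t fpexc)
{
    return fpexc | FPEXC_EN;
}

/* Quick self-check with typical MIDR values of the two CPU types. */
static void self_check(void)
{
    assert(identify_cpu(0x410fb767u) == CPU_ARM1176);
    assert(identify_cpu(0x410fc075u) == CPU_CORTEX_A7);
    assert(identify_cpu(0u) == CPU_UNKNOWN);
}
```

After these steps the core would just execute the wfi instruction in an endless loop, which is what keeps it off the bus until an interrupt arrives.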

And with the current setup AROS really flies on the RaspberryPi 2 😀
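As for the missing locking in bug() mentioned above: one possible direction is a simple spinlock around the whole message, so that the lines from different cores cannot interleave. The sketch below uses C11 atomics and a buffer in place of the serial port. It is an assumption about how this could be done, not the locking AROS ended up with, and serial_putc, serial_log and locked_puts are made-up names. Whether such a lock behaves correctly on the real hardware also depends on the cache and MMU setup of every core.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the serial port: characters land in a buffer. */
static char   serial_log[256];
static size_t serial_len;

static void serial_putc(char c)
{
    if (serial_len < sizeof(serial_log) - 1) {
        serial_log[serial_len++] = c;
        serial_log[serial_len] = '\0';
    }
}

/* One lock shared by all cores. */
static atomic_flag serial_lock = ATOMIC_FLAG_INIT;

static void serial_lock_acquire(void)
{
    /* Spin until the previous holder clears the flag. */
    while (atomic_flag_test_and_set_explicit(&serial_lock, memory_order_acquire))
        ;
}

static void serial_lock_release(void)
{
    atomic_flag_clear_explicit(&serial_lock, memory_order_release);
}

/* Emit a whole message while holding the lock. */
static void locked_puts(const char *msg)
{
    assert(msg != NULL);
    serial_lock_acquire();
    while (*msg)
        serial_putc(*msg++);
    serial_lock_release();
}
```

The lock is held per message rather than per character on purpose: locking each character would keep the hardware happy but would still allow the scrambled output shown above.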

 

2015
04.18

Raspberry Pi

Eons ago I was involved in several ARM-related projects. One of them was to make a linux-hosted port of AROS for ARM devices. These were the days full of fun and joy (if everything worked well) and frustration (if everything failed). After that my engagement in AROS dropped nearly to zero. There were, of course, some exceptions like improvements in memory management (TLSF support) or improvements in x86_64 AROS. But none of them were as low-level as I wished them to be.

Since at work we started to use some ARM-based embedded machines for our electronics, I had some fun coding for them. Not really low level, but weird enough :) This all led me to the idea of buying an ARM platform and making a native AROS port for it.

Even if there are better machines available, I decided to support RaspberryPi. One of the reasons was the availability of rPi code in the AROS repository: our great developer Nick Andrews had already started a port of AROS for these machines and made great progress with it. Another reason, a very important one, is the huge community behind Raspberry.

So, the board, the RaspberryPi 2, has been bought :)

During the last weeks Nick and I had fun bringing the AROS port back into a usable state, rewriting it and improving it in many places. Code which initially did not work on rPi2 boards at all now boots equally well (or equally badly) on both rPi and rPi2 into Wanderer, the desktop environment of AROS. The kernel of our system is loaded at virtual address 0xf8000000. The read-only portion of the kernel is MMU-protected against writes. All caches and write buffers are enabled. Slowly all bits and pieces are being improved and we are doing our best to get USB On-The-Go up and running. Having it would allow us to actually use AROS on these nice machines.

Meanwhile, I’m completing our small EABI library for ARM CPUs so that we can build the entire AROS with the gcc5 compiler. Well, fun :)

2015
04.10

Reboot

Over two years have passed since the last entry on this page. Only two years, but it feels like eons. I think it’s time to reactivate this blog :)

 

So, reboot…

2012
07.06

I think I will never understand that

This morning I was reviewing a small bit of code which, surprisingly, compiled just fine for the i386 target but failed for the ARM target. As always, the first thing I thought was “Oh no! That could be a variadic function!” and I was right, again.

But this time I was really surprised. The author of the code started out just fine:

#include <stdarg.h>
[...]

char * STDARGS GetKeyWord(int value, char *def, ...)
{
    [...]
    va_list va;
    [...]
    va_start(va, def);

And then, all of a sudden, the motivation for using stdarg passes away: va is cast to a LONG * type and the varargs are handled manually. Why oh why? Why does the coder use tons of casts where he could use a simple va_arg? Why string = *((char **) args) instead of string = va_arg(va, char *)? Why advance the args pointer by hand? Where is the missing va_end? I don’t know and I think I will never understand that.
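For comparison, this is roughly what sticking with stdarg all the way looks like. The function below is a made-up example (pick_keyword is not the reviewed GetKeyWord and I do not claim it does the same thing); it only shows the va_start / va_arg / va_end pattern without a single cast of the va_list. That matters for portability, because va_list is an opaque type and need not be a plain pointer on every target.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical helper: returns the index-th string from a NULL-terminated
   variadic list of char * arguments, or def if the list is too short. */
char *pick_keyword(int index, char *def, ...)
{
    char *result = def;
    char *s;
    va_list va;
    int i = 0;

    assert(index >= 0);

    va_start(va, def);
    /* va_arg both fetches the next argument and advances the list,
       so there is no pointer arithmetic to get wrong. */
    while ((s = va_arg(va, char *)) != NULL) {
        if (i++ == index) {
            result = s;
            break;
        }
    }
    va_end(va);   /* every va_start needs its va_end */

    return result;
}
```

A call such as pick_keyword(1, fallback, first, second, (char *)NULL) returns second; asking for an index beyond the end of the list returns fallback.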

2011
12.22

Merry X-mas

Merry Christmas to You all out there! :)

And sorry for disappointing many of you during this year. Some of you hoped that this year I would do something nice for you. And I failed many times. I would really like to tell you that I’m really sorry about that. I feel really bad about it.

Stephen, I feel really sorry that I failed and didn’t give you the promised enhancements to AROS. I do know you were really disappointed with the (lacking) results of my work even if you never said that. Thanks for everything you did for AROS.

ACube, I feel really sorry that I cannot give you any good news about progress I made. There is no progress. I’m sorry but work and real life are eating all the spare time I could have for you.

Nikos, I feel awful that I cannot give you the X-mas gift: the overlay for Intel GMA. I really wanted to but, once again, I failed. I failed. Forgive me. Keep up the good work on supporting AROS.

sorry guys…