====== gst1-rpicamsrc and uclibc-ng ======
What started as a simple wish to install the Raspberry Pi camera plugin for GStreamer 1 on Buildroot became a week-long saga of trial and error. Below are the conclusions:
===== The symptom =====
When trying to load the plugin, for example via the command:
gst-inspect-1.0 rpicamsrc
The load fails with a series of errors such as this:
/usr/libexec/gstreamer-1.0/gst-plugin-scanner: symbol 'mmal_port_pool_create': can't resolve symbol
At first these were symbols that were genuinely missing from the //libgstrpicamsrc.so// library, and adding a linker flag such as **-lmmal_component** solved the problem. But the issue became vastly more complex when the unresolved symbols belonged to **libmmal_util.so**, a library that the plugin is already linked against!
==== Background reading ====
These are the two best articles I have found on the subject:
  * [[http://www.kaizou.org/2015/01/linux-libraries/|Better understanding Linux secondary dependencies solving with examples]]
  * [[https://flameeyes.blog/2010/09/01/your-worst-enemy-undefined-symbols/|Your worst enemy: undefined symbols]]
===== Analysis =====
The first step was to figure out at what stage the undefined symbols are observed. The symbol **mmal_port_pool_create** is owned by //libmmal_util.so//. This is found using:
./host/bin/arm-buildroot-linux-uclibcgnueabihf-nm -D target/usr/lib/libmmal_util.so | grep port_pool_create
* The ARM version of //nm// is used, even though it's not necessary for this output.
* The -D parameter is used, since this is a dynamic shared library.
The output:
000041bc T mmal_port_pool_create
The **T** means the symbol is defined in the library's text (code) section and is global, so other objects can link against it.
The second step was to figure out which libraries use this symbol. This command helps:
nm -D target/usr/lib/libmmal*.so
Searching the output for **port_pool_create**, we find it marked as undefined in only one library, //libmmal_core.so//:
U mmal_port_pool_create
The **U** (undefined) means the library uses this symbol but expects another library to provide it.
Further examination shows that these libraries are **co-dependent**. //mmal_core// is using symbols from //mmal_util// and //mmal_util// is using symbols from //mmal_core//.
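The define/use check above can be repeated for any symbol with a small helper. This is only a sketch: the function names //symbol_state// and //scan_libs// are made up here, and it assumes the plain three-column **nm -D** format shown above (no symbol version suffixes).

```shell
#!/bin/sh
# symbol_state SYMBOL: read `nm -D` output on stdin and print the type
# letter (for example T or U) of SYMBOL, or "-" if it is not listed.
symbol_state() {
    awk -v sym="$1" '
        $NF == sym { print $(NF - 1); found = 1; exit }
        END { if (!found) print "-" }
    '
}

# scan_libs NM SYMBOL LIB...: run the given nm binary over each library
# and print one "library: letter" line per file.
scan_libs() {
    nm_bin=$1; sym=$2; shift 2
    for lib in "$@"; do
        printf '%s: %s\n' "$lib" "$("$nm_bin" -D "$lib" 2>/dev/null | symbol_state "$sym")"
    done
}
```

From the Buildroot output directory it could be called as **scan_libs ./host/bin/arm-buildroot-linux-uclibcgnueabihf-nm mmal_port_pool_create target/usr/lib/libmmal*.so**. Running it once per suspicious symbol makes the co-dependency visible as a mix of T and U across the two libraries.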
Normally it should be enough that our own //libgstrpicamsrc.so// loads both //libmmal_core.so// and //libmmal_util.so// for everything to work. We can verify that it does in three different ways:
readelf -d target/usr/lib/gstreamer-1.0/libgstrpicamsrc.so
Or
objdump -x target/usr/lib/gstreamer-1.0/libgstrpicamsrc.so
Both show these lines:
0x00000001 (NEEDED) Shared library: [libmmal_core.so]
0x00000001 (NEEDED) Shared library: [libmmal_util.so]
We can also see that //libmmal_core.so// is requested before //libmmal_util.so//. This follows the link order: **-lmmal_core** appears before **-lmmal_util** on the linker command line.
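To compare the NEEDED lists of several libraries it helps to strip the **readelf** output down to the library names. A minimal sketch, assuming the **readelf -d** line format shown above; the function name //needed_libs// is made up here.

```shell
#!/bin/sh
# needed_libs: read `readelf -d` output on stdin and print the library
# names from the (NEEDED) entries, one per line, in their original order.
needed_libs() {
    sed -n 's/.*(NEEDED).*\[\(.*\)].*/\1/p'
}
```

For example: **readelf -d target/usr/lib/gstreamer-1.0/libgstrpicamsrc.so | needed_libs**. Because the order is preserved, the output also shows which library the linker was given first.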
We can also run dependency analysis with the **ldd** tool:
LD_LIBRARY_PATH=./target/usr/lib ./host/bin/arm-buildroot-linux-uclibcgnueabihf-ldd target/usr/lib/gstreamer-1.0/libgstrpicamsrc.so
This lists the dependent libraries:
libmmal_core.so => ./target/usr/lib/libmmal_core.so (0x00000000)
libmmal_util.so => ./target/usr/lib/libmmal_util.so (0x00000000)
So if these libraries are supposed to be loaded, why aren't they?
I started looking at the next level of dependency: could it be that //libmmal_core// doesn't declare that it needs //libmmal_util//? Running **ldd** on //libmmal_core// shows that indeed //libmmal_util// is **not NEEDED**. This entry, called **DT_NEEDED**, should be written when the library is linked, meaning when building the Raspberry Pi userland. A missing entry is sometimes caused by a linker option called **as-needed**. There is more information about it in the links in the background reading section, but basically it means that the linker drops the NEEDED entry for a linked library if it finds that none of its symbols are actually used.
This behaviour can be overridden with a counter flag called **no-as-needed**, and it was actually a problem at some point in rpi-userland ([[https://github.com/raspberrypi/userland/issues/178|here]]), but **this is a decoy!** It is not the issue at hand: checking the CMake file [[https://github.com/raspberrypi/userland/blob/master/interface/mmal/core/CMakeLists.txt|here]] confirms that //mmal_util// is not present there at all:
target_link_libraries (mmal_core vcos)
But then how come this library works at all? Apparently (and I also confirmed this with my own ad hoc test) it is fine not to declare this dependency, as long as the parent library or executable links against both co-dependent libraries. So why doesn't it work now?
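A test of this kind can be reproduced on a desktop machine. The following is not the original test, only a sketch of the same idea: all file and function names are made up, and it assumes a native C compiler is available as **cc** on a glibc system.

```shell
#!/bin/sh
# codep_demo: build liba.so and libb.so, which call into each other but are
# not linked against each other, then link an executable against both and run it.
# All file and function names here are made up for the illustration.
codep_demo() (
    dir=$(mktemp -d) || exit 1
    cd "$dir" || exit 1

    cat > a.c <<'EOF'
int b_base(void);                     /* defined in libb.so */
int a_base(void)  { return 40; }
int a_total(void) { return a_base() + b_base(); }
EOF
    cat > b.c <<'EOF'
int a_base(void);                     /* defined in liba.so */
int b_base(void)  { return 2; }
int b_total(void) { return a_base() + b_base(); }
EOF
    cat > main.c <<'EOF'
#include <stdio.h>
int a_total(void);
int b_total(void);
int main(void) { printf("%d %d\n", a_total(), b_total()); return 0; }
EOF

    # Neither library names the other at link time, so neither gets a
    # DT_NEEDED entry for it; only the executable lists both.
    cc -shared -fPIC -o liba.so a.c &&
    cc -shared -fPIC -o libb.so b.c &&
    cc -o main main.c -L. -la -lb &&
    LD_LIBRARY_PATH=. ./main
    status=$?

    cd / && rm -rf "$dir"
    exit "$status"
)
```

If the executable runs and prints both totals, the loader resolved the cross-library calls from the executable's own NEEDED list, which is the behaviour described above.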
Apparently this is related to the fact that Buildroot is using a different dynamic linker from a different C library. Instead of glibc, the usual C library on Linux, this build uses the [[https://uclibc-ng.org/|uClibc-ng]] C library.
After some investigation, I found out how to debug the library loading process. First, support for **LD_DEBUG** must be enabled in Buildroot. In the defconfig:
BR2_UCLIBC_CONFIG="softbot-uClibc-ng.config"
and in that custom uClibc-ng config file:
SUPPORT_LD_DEBUG=y
Now on the Pi, before running the command, I can run **export LD_DEBUG=all**, although usually just **export LD_DEBUG=** produces enough data. The results are {{ :thesis:work-journal:ld_debug.txt |here}}.
The dynamic loading process goes through (I think) two stages. First, all of the mentioned libraries are loaded and dependencies are checked according to symbols. There I noticed something strange. This line appeared for every library, including //mmal_core// and //mmal_util//:
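Exporting the variable makes every later command in that shell print loader debug output too. A small wrapper that sets it for one command only and keeps the trace in a file can be sketched like this. The function name //run_with_ld_debug// is made up, and the sketch assumes the loader writes its debug trace to stderr.

```shell
#!/bin/sh
# run_with_ld_debug LOGFILE COMMAND [ARGS...]: run COMMAND with LD_DEBUG=all
# set for that one process only and save its stderr to LOGFILE.
# Assumption: the loader writes its debug trace to stderr.
run_with_ld_debug() {
    log=$1; shift
    LD_DEBUG=all "$@" 2> "$log"
}
```

On the Pi this could be used as **run_with_ld_debug /tmp/ld_debug.txt gst-inspect-1.0 rpicamsrc**, after which the log can be searched with **grep**.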
do_dlopen():438: Circular dependency, skipping '/usr/lib//libvcos.so'
But I don't think there really is a circular dependency there! This is very strange.
At the next stage it seems the DT_NEEDED entries are examined. And of course the ones we need are missing:
lib: /usr/lib//libmmal_core.so has deps:
/lib//libc.so.0 /usr/lib//libvcos.so
lib: /usr/lib//libmmal_util.so has deps:
/lib//libc.so.0 /usr/lib//libvcos.so
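With a long trace it is easier to pull out the dependency list of one library than to scroll for it. A rough sketch, assuming the log keeps the two-line layout shown above (a "lib: ... has deps:" line followed by one line of paths); the function name //deps_of// is made up.

```shell
#!/bin/sh
# deps_of LIBNAME: read an LD_DEBUG trace on stdin and print the line of
# dependencies that follows "lib: <path containing LIBNAME> has deps:".
deps_of() {
    awk -v lib="$1" '
        want { print; want = 0; next }
        $1 == "lib:" && /has deps:/ && index($2, lib) { want = 1 }
    '
}
```

For example, **deps_of libmmal_core.so < ld_debug.txt | grep libmmal_util** prints nothing for the trace above, which confirms that the loader never saw //libmmal_util// as a dependency of //libmmal_core//.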
I believe that they should have been loaded in the previous step, and I am trying to find out what happened by asking in **#uclibc-ng** on **freenode**.
===== Solution =====
Without knowing what is wrong with uclibc-ng, the first attempt at a fix was to use a utility called //patchelf// to add the missing //DT_NEEDED// entries:
(post-image.sh script)
patchelf --add-needed libmmal_core.so ${TARGET_DIR}/usr/lib/libmmal_util.so
patchelf --add-needed libmmal_util.so ${TARGET_DIR}/usr/lib/libmmal_core.so
But apparently that tool is not very stable, and using it corrupted the entire file.
OK, let's try to fork rpi-userland and compile it so that it does have //DT_NEEDED// entries where we need them. Trying to modify the commands as follows:
(libmmal_core CMakeLists.txt)
target_link_libraries (mmal_core vcos mmal_util)
(libmmal_util CMakeLists.txt)
target_link_libraries (mmal_util vcos mmal_core)
This does not actually work, because CMake complains about the circular dependency:
CMake Error: The inter-target dependency graph contains the following strongly connected component (cycle):
"mmal_core" of type SHARED_LIBRARY
depends on "mmal_util" (weak)
"mmal_util" of type SHARED_LIBRARY
depends on "mmal_core" (weak)
At least one of these targets is not a STATIC_LIBRARY. Cyclic dependencies are allowed only among static libraries
OK, but why? We only want the entries that tell the dynamic linker to load both of these libraries; the dynamic linker should handle the cyclic dependency!
Apparently there is a way around this, which I found pretty much by guessing. You can tell CMake to add the //DT_NEEDED// entry without adding the library to its precious dependency graph, by passing it as a plain **-l** option instead of a target name:
(libmmal_core)
target_link_libraries (mmal_core vcos -lmmal_util)
(libmmal_util)
target_link_libraries (mmal_util vcos -lmmal_core)
Recompiling userland this way and then running //gst-inspect// **actually works!**
Congratulations.