
Python Azure ML SDK issue on Ubuntu 22.04

It has been quite a while since I last posted.

Why? Simply because I did not run into any issue worth sharing. But now I did.

Recently we have been doing some machine learning on Azure using the Azure Machine Learning Python SDK. No problem, you might think. Well, as it turned out, Ubuntu 22.04 is not supported. The error message says so clearly. Which is in fact a lie.

The Error message:

NotImplementedError: Linux distribution ubuntu 22.04 does not have automatic support.
Missing packages: {'liblttng-ust.so.0'}
.NET Core 3.1 can still be used via `dotnetcore2` if the required dependencies are installed.
Visit https://aka.ms/dotnet-install-linux for Linux distro specific .NET Core install instructions.
Follow your distro specific instructions to install `dotnet-runtime-*` and replace `*` with `3.1.23`.


OK, but what is this? And why?

As the error mentions, the dotnetcore2==3.1.23 Python package uses .NET Core 3.1, but Ubuntu 22.04 has only dotnet6 packages. Microsoft does not seem to provide .NET Core 3.1 packages for Ubuntu 22.04 either.

The other problem, as I have already mentioned, is that the message is kind of a lie. When you load a Python module from dotnetcore2, the code looks for every *.so file in the dotnetcore2 package directory (just so you know, dotnetcore2 ships tons of .so files), then calls ldd on each of them and looks for a "not found" message in the command output. This is how it tries to make sure every required library is available. If something is missing, it adds that library to its list. If the list is not empty at the end of the "scan", it prints the missing libraries as in the error message above and, here is the trick, reports the distro and version as unsupported.
Spoiler: you can make it work.
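
To make the scan described above more concrete, here is a rough Python sketch of the same idea. It is not the code from dotnetcore2; the function names parse_missing and scan_package are made up for illustration, and it only assumes that ldd marks an unresolved dependency with "=> not found".

```python
import subprocess
from pathlib import Path


def parse_missing(ldd_output: str) -> set:
    """Return the library names that an ldd output reports as missing."""
    missing = set()
    for line in ldd_output.splitlines():
        # An unresolved dependency looks like: "libfoo.so.0 => not found"
        if "=> not found" in line:
            missing.add(line.split("=>")[0].strip())
    return missing


def scan_package(package_dir: str) -> set:
    """Run ldd on every *.so file under package_dir and collect missing libraries."""
    missing = set()
    for so_file in Path(package_dir).rglob("*.so"):
        result = subprocess.run(
            ["ldd", str(so_file)], capture_output=True, text=True, check=False
        )
        missing |= parse_missing(result.stdout)
    return missing
```

Calling scan_package with the directory of the installed dotnetcore2 package should give a set similar to the "Missing packages" part of the error message. An empty set means ldd can resolve everything.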

But what to do then?

For me the following was a working solution on Ubuntu 22.04, following part of this guide.

Use the snap version of dotnet and install version 3.1 of the SDK and runtime.

sudo snap install dotnet-sdk --classic --channel=3.1
sudo snap alias dotnet-sdk.dotnet dotnet
sudo snap install dotnet-runtime-31 --classic



In your Python virtual environment you will need dotnetcore2. I have version 3.1.23.
At this point you might assume it will work. But it will not. The libraries from the snaps are not on the system library search path, so you need to add them.

NOW COMES THE DANGEROUS PART!!! DO NOT MAKE A MISTAKE!!!

You need to add the libraries installed by the snaps to your ld.so.conf. However, some of the libraries delivered by Microsoft conflict with the system ones. So make sure, through the name of the config file, that these snap libraries get the lowest priority.

sudo bash -c 'cat <<EOF >/etc/ld.so.conf.d/xxxdotnet.conf
/snap/dotnet-runtime-31/current/usr/lib/x86_64-linux-gnu
/snap/dotnet-sdk/current/usr/lib/x86_64-linux-gnu/
EOF'



MAKE SURE THE FILE NAME STARTS WITH XXX!!!
The xxx prefix of the filename puts the content of the file at the end of the ld.so chain, even after x86_64-linux-gnu.conf.
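
If you want to double check the ordering, the small Python sketch below lists the config files in sorted order. It assumes that the files in /etc/ld.so.conf.d are picked up in plain sorted filename order; load_order is a made-up helper name, not part of any tool.

```python
from pathlib import Path


def load_order(conf_dir: str = "/etc/ld.so.conf.d") -> list:
    """Return the *.conf file names in conf_dir in sorted filename order."""
    return sorted(p.name for p in Path(conf_dir).glob("*.conf"))


if __name__ == "__main__":
    # xxxdotnet.conf should be the last entry printed.
    for name in load_order():
        print(name)
```

The digit 8 sorts before the letter x, which is why x86_64-linux-gnu.conf comes before xxxdotnet.conf.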

Now run


sudo ldconfig
ldconfig -v 2>/dev/null|grep "^/snap"
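
The same check can be done from Python before you try the SDK again. This is only a sketch: it parses the output of ldconfig -p (which prints the loader cache) and assumes the usual "name (flags) => path" line format. The helper names cached_libraries and is_in_cache are made up.

```python
import subprocess


def cached_libraries(ldconfig_output: str) -> set:
    """Extract the library names from the output of `ldconfig -p`."""
    names = set()
    for line in ldconfig_output.splitlines():
        # Entries look like: "libfoo.so.0 (libc6,x86-64) => /path/libfoo.so.0"
        if "=>" in line:
            names.add(line.split()[0])
    return names


def is_in_cache(soname: str) -> bool:
    """Return True if soname is listed in the dynamic loader cache."""
    result = subprocess.run(
        ["ldconfig", "-p"], capture_output=True, text=True, check=False
    )
    return soname in cached_libraries(result.stdout)
```

After sudo ldconfig, is_in_cache("liblttng-ust.so.0") should return True. That is the library the error message complained about.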


And now the Python call below should work

dataset: FileDataset = Dataset.File.from_files(path=dp)
mount_context: MountContext = dataset.mount(".data")
mount_context.start()


If you made a mistake and your system is ruined, boot Ubuntu in recovery mode, delete /etc/ld.so.conf.d/xxxdotnet.conf and then run sudo ldconfig while still in recovery mode.

Final words

There are other ways to fix it. If you have read above what the dotnetcore2 package does, you can solve the ldd issue in other ways too.
I also hope Microsoft will fix it soon, since using this now is quite a pain.
However, Ubuntu 22.04 was released in April 2022 and as of now, 6 months later, it is still not working.
