Gamifying Annotations, Digital Sovereignty, AI Engineer Fair, and Open Source Contributions


Gamepad support for Python is here!

https://www.youtube.com/watch?v=4fXLB5_F2rg

It's rare to find something that is universally agreed upon. Everyone agrees that data annotation is an important part of the AI system pipeline, yet nobody actually wants to do the annotating. It's often seen as tedious, monotonous manual work: a chore to get done so you can proceed to more exciting things. That's why it's so important to do everything you can to make annotating as frictionless as possible.

I don't believe that annotating data is all that bad. When I participated as the Danish language lead for the FineWeb-C multilingual dataset, I decided to give it a go myself. The project used Argilla as the data annotation platform. It was really simple, intuitive and, once you got going, very fast to use. I ended up annotating 1,000 texts for their educational content just by loading it up on my iPad and doing a bit here and there.

In this video, Vincent Warmerdam shows that you can use an 8BitDo gamepad and build a simple interface in Marimo to provide a fun and interactive annotation experience. I like the way he's thinking!

Both Copenhagen and Aarhus Municipalities Want to Drop Microsoft

https://www.dr.dk/nyheder/seneste/baade-koebenhavns-og-aarhus-kommune-vil-droppe-microsoft

Digital sovereignty is a very hot topic in Denmark at the moment, primarily due to the geopolitical instability created by the current US administration. I have heard people say that "Denmark is a Microsoft country" so often over the past few years, and it has always nagged at me. Analogous to how Steve Ballmer said in the early 2000s that Linux is a cancer, I believe that companies whose primary job is to spread Microsoft products and services are largely cancerous to Danish society and threaten our sovereignty. Our politicians often brag about Denmark's level of digitisation, but at what cost? We currently don't own our (critical) infrastructure.

It is uplifting to see media like Ingeniøren covering stories about how the US is forcing Microsoft to block email access for the Chief Prosecutor of the ICC, and highlighting how the French are using open source software in their government.

I could go on and on about this. I'll probably have another issue with links to some Netcompany-bashing very soon.

One thing I would like to mention is that I'm so grateful that David Heinemeier Hansson is using his voice and authority to set the record straight on this topic. His expertise and viewpoints are so valuable for the debate. I think we would be overwhelmed with wolves in the discussion of how to secure the chickens, to borrow his analogy, had it not been for his contributions.

AI Engineer World's Fair

https://www.youtube.com/@aiDotEngineer/streams

When I first heard of the role "AI Engineer", I quickly brushed it off as a silly attempt at iterating on ML Engineer, Data Scientist etc. that I didn't think would catch on. Having worked with AI professionally for about a year now, I see very clearly that I was wrong. The role captures very well that organisations don't necessarily need someone to train models or run elaborate data experiments, but rather someone who can integrate AI into systems through engineering practices.

Swyx is doing a lot in the AI Engineering space. I enjoy his AI News newsletter as well as the latent.space podcast.

I'm looking forward to browsing through the catalog of talks once the fair is over.

Good and Easy Way to Contribute to Open Source

https://bsky.app/profile/simonwillison.net/post/3lqo2fzc5yk2i

The Danish Data Science Community recently put together two committees: one for Open Source initiatives and one for Events - and I'm part of the Open Source committee!

It has made me think about how Open Source work is on a spectrum and how there are high-impact opportunities everywhere on that spectrum. Simon Willison gives a good example in a post on Bluesky, where he points out that simply writing up a small tutorial about using an open source project is greatly beneficial. A post like that makes the project easier to discover and lowers the barriers to engaging with it. I find it really encouraging to know that contributions like that are appreciated.