I'm an ambitious programmer who specializes in machine learning and data analysis. My professional passion is to have a positive influence on the world through data-centric solutions, whether that is by automating tasks, providing insights to stakeholders with machine learning, or simplifying and explaining complex data through visualizations and statistics. I enjoy a steep learning curve and I'm quick to get familiar with new business terminology and technologies.
I have professional experience with machine learning research, data pipelines, data visualization, development of machine learning models, preparing machine learning models for production, and machine learning operations (MLOps) from my time as Chief Technical Officer of Alvenir and its precursor project, DanSpeech.
I have strong software development skills from working in an IT consultancy firm, and I know my way around microservice and monolith architectures, databases, front-end development, cloud infrastructure, cloud services, and various APIs, frameworks and programming languages.
I have experience working with agile methodologies such as Scrum and SAFe, and I find well-planned working procedures important for making efficient progress.
Python is my go-to language when doing machine learning and data analysis and I'm very proficient in writing clean Python code.
I used Kubernetes (GKE and EKS) at Alvenir (and at DTU before we became Alvenir) to run our speech recognition platform. We managed our applications using Helm and Helmfile, and I set everything up, from the cluster and custom Helm charts to integrations and deployment scripts. During my time at Netcompany, I also got some exposure to OpenShift.
Since the second year of my studies, I have worked with machine learning. I have trained state-of-the-art ASR and NLP models and deployed them to production. I know how to optimize inference, e.g. by using ONNX or pre-built Docker images. I'm proficient in many machine learning frameworks and can quickly learn new ones.
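As a small illustration of the ONNX route (a toy model and file name of my own choosing here, not one of the production ASR models), the sketch below exports a PyTorch module to ONNX with a dynamic batch dimension and runs it through onnxruntime:

```python
# Minimal sketch: export a hypothetical PyTorch model to ONNX and run it
# with onnxruntime. The toy model and file name are illustrative only.
import numpy as np
import onnxruntime as ort
import torch
import torch.nn as nn


class TinyClassifier(nn.Module):
    """Toy stand-in for a real model (e.g. an acoustic or NLP model)."""

    def __init__(self) -> None:
        super().__init__()
        self.net = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


model = TinyClassifier().eval()
dummy_input = torch.randn(1, 16)

# Export with a dynamic batch dimension so the ONNX graph accepts any batch size.
torch.onnx.export(
    model,
    dummy_input,
    "tiny_classifier.onnx",
    input_names=["input"],
    output_names=["logits"],
    dynamic_axes={"input": {0: "batch"}, "logits": {0: "batch"}},
)

# Run inference through onnxruntime instead of the PyTorch runtime.
session = ort.InferenceSession("tiny_classifier.onnx", providers=["CPUExecutionProvider"])
batch = np.random.randn(4, 16).astype(np.float32)
(logits,) = session.run(["logits"], {"input": batch})
print(logits.shape)  # (4, 2)
```

Serving through onnxruntime typically gives you graph-level optimizations and drops the PyTorch dependency from the inference image, which also helps keep pre-built Docker images small.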
I have worked with many different technologies and more Python frameworks than it is reasonable to list. Here is a curated list of my skills.
Python
Kafka
PyTorch
TensorFlow
NumPy
DevOps
MLOps
Pandas
ONNX
FastAPI
Django
Hugging Face Transformers
Scikit-learn
Kubernetes
Seldon
Machine Learning
Data visualization
D3.js
Linux
Amazon Web Services (AWS)
Google Cloud Platform (GCP)
Docker
Helm
Helmfile
Terraform
Jenkins
Java
Kotlin
Groovy
JavaScript
TypeScript
Angular
Gradle
Ebean
Spring (Boot, Web, Cloud, Security)
Liquibase
Flyway
Hibernate
Bash
CSS/SCSS
HTML
Git
SQL
Filter my timeline using the buttons below.
I am a big fan of open-source technology! I think sharing and collaborating are very important in order to move forward faster and to ensure that essential high-quality machine learning models (e.g. speech recognition models) are available to everyone and not just a few companies. I sometimes contribute to open-source libraries in my spare time because I really enjoy the challenge of getting familiar with a completely new codebase. Below is a list of some of the open-source projects I have been a major part of.
Punctuation restoration Python package for Danish, English and German. The punctuation models are based on the BERT architecture.
Python package for automatic speech recognition based on the DeepSpeech 2 architecture.
A Danish pre-training of the wav2vec2-large architecture using ~120,000 hours of speech data. Collaboration with Aarhus University.
A fine-tuning on NST data (approximately 200 hours) and the Danish part of Common Voice 9. Collaboration with Aarhus University.
A Danish pre-training of the wav2vec2-base architecture using ~1,300 hours of speech data.
A fine-tuning of the wav2vec2-base model on NST data (approximately 200 hours).
A model fine-tuned on a custom-filtered subset of Danish mC4, able to determine where to perform punctuation restoration.
An evaluation dataset for Danish speech recognition consisting of ~5 hours of speech.