elliottd/danish-foundation-models

 
 

Code style: black

A collaborative project for training foundational Danish language models. It seeks to:

  • Develop and maintain state-of-the-art models for Danish
  • Validate these models across a wide range of tasks
  • Provide good documentation that allows users to critically assess the models for their use case
  • Open-source both the models and the source code

Note: This repository is intended for the text model of DFM.

More information:

For more information, please check out the following links:

📑 About: An overview of the DFM project
Research Paper: A paper introducing DFM and its rationale
🚀 Models: An overview of the current models available through the DFM project
💽 Datasets: Datasheets for the datasets, covering preprocessing, the reasons for their construction, and more

Wish to contribute?

DFM is a collaborative project for training and maintaining Danish language models. If you wish to contribute, don't hesitate to reach out through one of the following channels:

🗣 DDSC Slack: Join the discussion in the "danish-foundation-models" channel
💬 GitHub Discussion: Ask questions or start a discussion
🚨 GitHub Issues: Noticed a bug in the code? Please create an issue

You can contribute in several ways:

  • Developer time, the lifeblood of any open-source project
  • Pre-training datasets you wish to include in the model training
  • Validation tasks, which can even be private benchmarks where you only wish to share the performance metrics
  • And probably in many other ways

Setting up development environment

Method 1: Dev container

By far the easiest way is to use our included development container. If you're using VSCode:

  • Ensure you have either OrbStack or Docker installed
  • Press this button: Open in Dev Container
  • Select "From Dockerfile"
  • Press "OK" on the feature screen

Method 2: Manual install

Install as you usually would, replicating the commands in Dockerfile.dev on your own machine.
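To see which commands need replicating, it can help to list the RUN instructions in Dockerfile.dev. The following is a minimal sketch, not part of the repository: the helper name `run_commands` is hypothetical, and the parser assumes shell-form RUN instructions with backslash line continuations (it does not handle heredocs or strip RUN flags such as `--mount`).

```python
from pathlib import Path


def run_commands(dockerfile_text: str) -> list[str]:
    """Return the command of every RUN instruction, in file order."""
    commands = []
    pending = ""  # accumulates an instruction continued with a trailing backslash
    for raw in dockerfile_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines, including inside continuations.
        if not line or line.startswith("#"):
            continue
        if line.endswith("\\"):
            pending += line[:-1].strip() + " "
            continue
        instruction = (pending + line).strip()
        pending = ""
        parts = instruction.split(None, 1)
        # Dockerfile instruction keywords are case-insensitive.
        if len(parts) == 2 and parts[0].upper() == "RUN":
            commands.append(parts[1].strip())
    return commands


if __name__ == "__main__":
    path = Path("Dockerfile.dev")  # file name taken from this README
    if path.exists():
        for command in run_commands(path.read_text(encoding="utf-8")):
            print(command)
```

Run it from the repository root and review each printed command before executing it, since commands written for a container image (for example system package installs) may need adapting to your own system.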

Current Contributors and Collaborators

This project has collaborators across industry, national institutions, and research centers. It uses compute resources supplied by UCloud through the DeiC e-infrastructure grant.
