
Learning Records Converter (LRC)

Overview

Learning Records are available in many formats, either standardized (xAPI, SCORM, IMS Caliper, cmi5) or proprietary (Google Classroom, MS Teams, CSV, etc.). This wide variety of formats is a barrier to many use cases of learning records, as it prevents learning records datasets from multiple sources or organizations from being easily combined and shared.

As a result, Inokufu was tasked, within the Prometheus-X ecosystem, with developing the Learning Records Converter, a parser that translates datasets of learning traces according to common xAPI profiles.

Approach

LRC provides a streamlined conversion process through a two-phase operation, which ensures that the input data is correctly interpreted, transformed, augmented, and validated to produce compliant JSON outputs. The first phase converts a Learning Record from various input formats into a single xAPI format. The second phase transforms these xAPI learning records so that they comply with the xAPI DASES profiles.

Here is an architecture diagram illustrating how the LRC parses an input Learning Record in various standard, custom or unknown formats into an output Learning Record that follows a DASES xAPI profile.

LRC_Phase1.png

Phase 1: Learning Records to xAPI

The aim of this first phase is to convert a Learning Record to xAPI. To do this, the LRC runs two consecutive processes: Input Data Validation and Data Transformation.

  • Input Data Validation is responsible for interpreting and validating the input data format (supplied as JSON).
  • Data Transformation transforms the input data into the xAPI format, where possible.

Input Data Validation

This component's role is to identify the format of the input Learning Records and to validate them against that format. Each dataset of Learning Records has a metadata attribute stating the input-format of the learning records.

If the input-format is known, the corresponding data descriptor is loaded to check that the Learning Records are compliant. Otherwise, every data descriptor is loaded and each one is tried in turn to interpret the learning records.
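For illustration, here is a minimal Python sketch of that detection step. The descriptors registry and the is_valid method are hypothetical names, not the project's actual API:

# Hypothetical sketch of the Input Data Validation step; names are illustrative only.
def detect_and_validate(records: list[dict], input_format: str | None, descriptors: dict) -> str:
    """Return the confirmed input format, or raise if no descriptor accepts the records."""
    candidates = [input_format] if input_format else list(descriptors)
    for fmt in candidates:
        descriptor = descriptors.get(fmt)
        if descriptor and all(descriptor.is_valid(record) for record in records):
            return fmt
    raise ValueError("No data descriptor matches these learning records")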

Data Transformation

This component's role is to convert the validated input data into the xAPI format.

Depending on the input-format of the learning records dataset, the processing differs as follows (a short sketch follows the list):

  • if the input-format is xapi (cmi5 is considered a xapi profile), the conversion will be skipped.
  • if the input-format is standard (scorm, ims_caliper), the corresponding mapping will be used by the component to process the learning record. For each standard, there is a corresponding mapper which enables the formatting of the learning record into the xapi format.
  • if the input-format is unknown, then the Data Transformation module will do its best to automatically map each item of a learning record into the xapi format.
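A comparable sketch of this dispatch logic, again with hypothetical names (the mappers registry and the best-effort fallback are illustrative, not the project's actual code):

# Hypothetical sketch of the Data Transformation dispatch; not the project's actual code.
def transform_to_xapi(record: dict, input_format: str, mappers: dict) -> dict:
    """Route one learning record to the appropriate conversion strategy."""
    if input_format == "xapi":  # cmi5 is treated as an xAPI profile
        return record  # already xAPI: conversion is skipped
    if input_format in mappers:  # known standard, e.g. scorm_2004 or imscaliper1_1
        return mappers[input_format](record)
    # Unknown format: best-effort, field-by-field mapping (stubbed here).
    return dict(record)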

The first phase of the LRC is built with community collaboration in mind. It allows for easy contributions and extensions to both the input and output formats. The community can develop and share their own data descriptors and converters, which can be seamlessly integrated into the LRC's ecosystem, thereby enhancing the application's versatility to handle various input and output formats.

Here is a detailed architecture diagram illustrating the first phase of the LRC, which parses an input Learning Record in various standard, custom or unknown formats into an output Learning Record that follows the xAPI standard.

LRC_Phase1.png

These two consecutive processes can be summarized by this flow chart.

LRC_Phase1_FlowChart.png

Phase 2: xAPI to DASES

The aim of this second phase is to transform the xAPI Learning Record according to DASES xAPI profiles.

Each profile is defined in a JSON-LD file and includes specific concepts, extensions, and statement templates that guide the transformation and validation of xAPI statements. The DASES profiles in JSON-LD format are automatically downloaded and updated from their respective GitHub repositories as defined in the .env file.

The LRC enriches xAPI statements with profile-specific data, validates statements against profile rules, and gives recommendations for improving compliance with the profiles.

This ensures that the converted learning records are not just in xAPI format, but also adhere to the specific DASES profile standards, enhancing interoperability and consistency across different learning systems.

Enriched Fields

During enrichment, the LRC sets the following fields (an illustrative sketch follows the list):
  • verb.id: Set to the appropriate verb URI (e.g., "https://w3id.org/xapi/netc/verbs/accessed" for accessing a page)
  • verb.display.en-US: Human-readable description of the verb
  • object.definition.type: Set to the appropriate activity type URI
  • context.contextActivities.category: Includes a reference to the associated profile
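For example, a hypothetical enrichment helper for a "page accessed" statement might set these fields as sketched below. Only the verb URI comes from the list above; the activity type URI and profile id are placeholders, not actual DASES values:

# Illustrative sketch only; URIs other than the verb id are placeholders.
def enrich_accessed_statement(statement: dict, profile_id: str) -> dict:
    """Fill profile-specific fields for a 'page accessed' statement."""
    statement.setdefault("verb", {}).update({
        "id": "https://w3id.org/xapi/netc/verbs/accessed",
        "display": {"en-US": "accessed"},
    })
    statement.setdefault("object", {}).setdefault("definition", {})["type"] = "<activity_type_uri>"
    category = statement.setdefault("context", {}).setdefault("contextActivities", {}).setdefault("category", [])
    category.append({"id": profile_id})  # reference to the associated profile
    return statement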

DASES Profiles in Detail

The LRC currently supports these main profiles:

LMS Profile
  • Purpose: Standardizes tracking of common LMS activities.
  • Key Concepts: Includes verbs like 'accessed', 'downloaded', 'registered', and 'uploaded'.
  • Activity Types: Covers webpages, files, courses, and various media types.
  • Extensions: Provides context for activities like file types, session IDs, and user roles.
Forum Profile
  • Purpose: Captures forum-specific interactions in learning environments.
  • Key Concepts: Includes verbs related to posting, replying, and viewing forum content.
  • Activity Types: May cover forum threads, posts, and user interactions.
Assessment Profile
  • Purpose: Tracks assessment-related activities and results.
  • Key Concepts: Includes verbs like 'started', 'terminated', and 'completed'.
  • Activity Types: Covers different types of assessments and question formats.

Setup and installation

You can run the application either directly with pipenv or using Docker.

But first, clone the repository:

git clone [repository_url]
cd [project_directory]

Then, set up environment variables: create a .env file in the project root by copying .env.default:

cp .env.default .env

You can then modify the variables in .env as needed.

With Docker

The application is containerized using Docker, with a robust and flexible deployment strategy that leverages:

  • Docker for containerization with multi-environment support (dev and prod) using Docker Compose profiles
  • Traefik as a reverse proxy and load balancer, with built-in SSL/TLS support via Let's Encrypt, and a dashboard in dev environment.
  • Gunicorn as the production-grade WSGI HTTP server, with configurable worker processes and threads, and dynamic scaling based on system resources.

Prerequisites

  • Docker and Docker Compose installed on your machine.

Development Environment

Build and run the development environment:

docker-compose --profile dev up --build

The API will be available at: http://lrc.localhost

The Traefik Dashboard will be available at: http://traefik.lrc.localhost

Quick Start (Without volumes or Traefik)

For a quick test without the full stack:

docker build --target dev-standalone -t lrc-dev-standalone .
docker run -p 8000:8000 lrc-dev-standalone

Note: This version won't reflect source code changes in real-time.

Production Environment

Configure production-specific settings, then build and run the production environment:

docker-compose --profile prod up --build

With pipenv

Prerequisites

  • Python 3.12 (Note: The project is developed and tested with Python 3.12. It may work with later versions, but this is not guaranteed.)

Installation

  1. Install pipenv if you haven't already:

    pip install pipenv
    
  2. Install the project dependencies:

    pipenv install
    
  3. Start the FastAPI server using the script defined in the Pipfile:

    pipenv run start

Running the Application

The API will be available at http://localhost:8000.

Endpoints

Convert Traces

To convert a trace, send a POST request to the /convert endpoint:

POST /convert
Content-Type: application/json

{
  "input_trace": {
    // Your input trace data here
  },
  "input_format": "<input_format>"
}

Supported input formats:

  • xapi
  • imscaliper1_2
  • imscaliper1_1
  • scorm_2004
  • matomo

Response format:

{
  "output_trace": {
    // Converted xAPI trace data
  },
  "recommendations": [
    {
      "rule": "presence",
      "path": "$.result.completion",
      "expected": "included",
      "actual": "missing"
    }
  ],
  "meta": {
    "input_format": "<input_format>",
    "output_format": "<output_format>",
    "profile": "<DASES profile found>" // Optional, present when a DASES profile is detected
  }
}

The meta object summarizes the conversion: the input format that was used, the output format, and the detected DASES profile, when one is found.
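For example, a minimal Python client for this endpoint could look like the sketch below (the base URL, timeout and trace content are placeholders, and the requests library is only one possible HTTP client):

import requests

payload = {
    "input_trace": {
        # your input trace data here (shape depends on the input format)
    },
    "input_format": "scorm_2004",
}
response = requests.post("http://localhost:8000/convert", json=payload, timeout=30)
response.raise_for_status()
result = response.json()
print(result["output_trace"])      # converted xAPI trace
print(result["recommendations"])   # hints for improving profile compliance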

Custom Mapping

The /convert_custom endpoint allows for flexible conversion of custom data formats using mapping files:

POST /convert_custom
Content-Type: multipart/form-data

data_file: <your_data_file>
mapping_file: <your_mapping_file>
config: { // Optional
  "encoding": "utf-8",
  "delimiter": ",",
  "quotechar": "\"",
  "escapechar": "\\",
  "doublequote": true,
  "skipinitialspace": true,
  "lineterminator": "\r\n",
  "quoting": "QUOTE_MINIMAL"
}
output_format: "xAPI" (default)
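As an illustration, the same request sent from Python might look like the sketch below. File names, config values and the base URL are placeholders, and passing config as a JSON-encoded form field is an assumption:

import json
import requests

with open("your_data_file.csv", "rb") as data_file, open("your_mapping_file", "rb") as mapping_file:
    response = requests.post(
        "http://localhost:8000/convert_custom",
        files={"data_file": data_file, "mapping_file": mapping_file},
        data={
            "config": json.dumps({"encoding": "utf-8", "delimiter": ","}),  # optional
            "output_format": "xAPI",
        },
        timeout=60,
    )
response.raise_for_status()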

The endpoint supports:

  • CSV files with custom parsing configurations
  • Automatic detection of delimiters and structure when not provided
  • Custom mapping files for data transformation
  • Normalization of input data for consistent JSON output
  • Built-in date format conversion to xAPI requirements
  • Streaming responses for large datasets

For the mapping file structure and guidance on writing your own mapping, see the Mapping section below.

Validate Traces

The endpoint will:

  • Validate the trace structure and content
  • Attempt to detect the format if not provided
  • Verify against the specified format if provided
  • Return validation errors (based on Pydantic models) if the trace is invalid
  • Return the confirmed input format

Send a POST request to the /validate endpoint:

POST /validate
Content-Type: application/json

{
  "input_trace": {
    // Your input trace data here
  },
  "input_format": "<input_format>" // Optional
}

Response format:

{
  "input_format": "<detected_or_confirmed_format>"
}
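For example (the base URL and trace content are placeholders):

import requests

response = requests.post(
    "http://localhost:8000/validate",
    json={"input_trace": {}},  # your trace here; add "input_format" to verify a specific format
    timeout=30,
)
response.raise_for_status()
print(response.json()["input_format"])  # detected or confirmed format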

Development

API Documentation

Once the server is running, you can access the interactive API documentation:

  • Swagger UI: Available at /docs
  • ReDoc: Available at /redoc

These interfaces provide detailed information about all available endpoints, request/response schemas, and allow you to test the API directly from your browser.

Code Formatting and Linting

The project uses Ruff for linting and formatting. Ruff is configured in pyproject.toml with strict settings:

  • All rules enabled by default
  • Python 3.12 target version
  • 88 character line length
  • Custom rule configurations for specific project needs

Mapping

To understand how mapping works or to create your own mapping, a document is available here.

Project Architecture

An explanation of how the project is organised is available here.

Environment Variables

The following table details the environment variables used in the project:

Variable | Description | Required | Default Value | Possible Values

Environment Configuration:
ENVIRONMENT | Application environment mode | No | development | development, production
LOG_LEVEL | Minimum logging level | No | info | debug, info, warning, error, critical
DOWNLOAD_TIMEOUT | Timeout for downloading profiles | No | 10 | Positive integer
CORS_ALLOWED_ORIGINS | Allowed CORS origins | No | * | Comma-separated origins

Internal Application Configuration:
APP_INTERNAL_HOST | Host for internal application binding | No | 0.0.0.0 | Valid host/IP
APP_INTERNAL_PORT | Port for internal application binding | No | 8000 | Any valid port

External Routing Configuration:
APP_EXTERNAL_HOST | External hostname for the application | Yes | lrc.localhost | Valid hostname
APP_EXTERNAL_PORT | External port for routing (dev env only) | No | 80 | Any valid port

Traefik Configuration:
TRAEFIK_RELEASE | Traefik image version | No | v3.2.3 | Valid Traefik version
LETS_ENCRYPT_EMAIL | Email for Let's Encrypt certificate | Yes | [email protected] | Valid email

Profile Configuration:
PROFILES_BASE_PATH | Base path for storing profile files | Yes | data/dases_profiles | Valid directory path
PROFILES_NAMES | Names of profiles to use | Yes | lms,forum,assessment | Comma-separated profile names
PROFILE_LMS_URL | URL for LMS profile JSON-LD | Yes | GitHub LMS profile URL | Valid URL
PROFILE_FORUM_URL | URL for Forum profile JSON-LD | Yes | GitHub Forum profile URL | Valid URL
PROFILE_ASSESSMENT_URL | URL for Assessment profile JSON-LD | Yes | GitHub Assessment profile URL | Valid URL

Performance Configuration:
WORKERS_COUNT | Number of worker processes | No | 4 | Positive integer
THREADS_PER_WORKER | Number of threads per worker | No | 2 | Positive integer

Note: The URLs for the profiles are examples and may change. Always use the most up-to-date URLs for your project.

Refer to .env.default for a complete list of configurable environment variables and their default values.

Errors

The API uses standard HTTP status codes:

Status Code | Description | Possible Causes
400 | Bad Request | Invalid input format, malformed JSON
404 | Not Found | Invalid endpoint, resource doesn't exist (profile file)
422 | Validation Error | Format validation failed
500 | Internal Server Error | Server-side processing error

Notes:

  • Error responses include a detail field with a human-readable message (see the example after these notes)
  • Development mode includes additional debug information
  • Production mode omits sensitive error details
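For example, a client might surface these errors as sketched below (the base URL and payload are placeholders):

import requests

response = requests.post(
    "http://localhost:8000/convert",
    json={"input_trace": {}, "input_format": "xapi"},
    timeout=30,
)
if not response.ok:
    # Error responses carry a human-readable "detail" field.
    print(response.status_code, response.json().get("detail"))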

Contribution guidelines

We welcome and appreciate contributions from the community! There are two ways to contribute to this project:

  • If you have a question or if you have spotted an issue or a bug, please start a new issue in this repository.
  • If you have a suggestion to improve the code or fix an issue, please follow these guidelines:
    1. Fork the Repository: Fork the Learning Records Converter repository to your own GitHub account.
    2. Create a Branch: Make a new branch for each feature or bug you are working on.
    3. Make your Changes: Implement your feature or bug fix on your branch.
    4. Submit a Pull Request: Once you've tested your changes, submit a pull request against the Learning Records Converter's master branch.

Before submitting your pull request, please ensure that your code follows our coding and documentation standards. Don't forget to include tests for your changes!

Project status

Please note this project is a work in progress.

  • State of the art of the latest evolutions of learning traces standards
  • Quantitative inventory of the main software learning outcomes standards and tools used in the field of education and training in France and in Europe from the list identified in the working groups of the Data space Education & Skills (i.e. SCORM, xAPI, cmi5, IMS Caliper)
  • Definition of the architecture of the API endpoints in accordance with the technical recommendations of GAIA-X
  • Development of the endpoints necessary for parsing the various priority standards identified above
  • API testing with model datasets provided by Prometheus volunteer partners
  • Deployment of the service in a managed version in one of the partner cloud providers
  • Development of automated service deployment scripts for multi-cloud use (infrastructure as code e.g. Terraform) at partner cloud providers
  • Drafting of the final public documentation

Interoperability of Learning Records: State-of-the-Art in 2023

As preparatory work for the development of the Learning Records Converter, Inokufu conducted an exhaustive state-of-the-art review and quantitative study of the interoperability of Learning Records.

This study is available here

References

https://gaia-x.eu/gaia-x-framework/

https://prometheus-x.org/

https://dataspace.prometheus-x.org/building-blocks/interoperability/learning-records

About

Learning Records Converter is a tool enabling the interoperability of Learning Records in various formats (xAPI, SCORM, IMS Caliper, cmi5, proprietary). It is a common building block for the Data space of Education & Skills and developed within the Prometheus-X ecosystem.
