COSMOS: Conceptual Modelling, Semantic Documentation and Semantic Data Management Support

📅 2025 – Present

Project Context / Objective:

The Max Planck Institute for the History of Science (MPIWG) maintains multiple independent research datasets, of which three (Rare Books, Machine Drawings, and VoH Images) were selected for semantic data analysis and migration. Takin.Solutions was commissioned to develop standardised semantic reference models, produce X3ML mappings, and define a middle-layer model to enable graph-based integration, Linked Data publication, and sustainable long-term reuse of these datasets.


Takin.Solutions’ Role / Contributions:

Takin.Solutions acted as semantic modelling consultant and mapping specialist, responsible for the full workflow from dataset analysis through to finalised documentation and cross-dataset integration design.


Key Activities:

  • Analysed datasets and scholarly use cases to inform modelling decisions.
  • Designed CIDOC CRM-aligned semantic reference models, extending FRBRoo/LRMoo as needed for the specific characteristics of each dataset.
  • Created visual entity diagrams to support review and communication with the MPIWG team.
  • Developed and refined X3ML mappings and URI policies for all three datasets.
  • Defined a cross-dataset semantic middle layer to enable integrated research and graph-based querying across collections.
  • Delivered finalised documentation via Zellij, Takin’s semantic documentation platform.
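
To illustrate what a URI policy of the kind mentioned above might involve, here is a minimal Python sketch that builds stable, repeatable identifiers from a dataset name, an entity type, and a source record identifier. The base namespace, the function name `mint_uri`, the path layout, and the sample identifier are all hypothetical and chosen for illustration only; they do not reflect the URI policies actually defined in the project.

```python
from urllib.parse import quote

# Hypothetical namespace for illustration; not the namespace used by MPIWG.
BASE = "https://example.org/resource"


def slugify(label: str) -> str:
    """Lowercase a label and join its whitespace-separated words with hyphens."""
    return quote("-".join(label.strip().lower().split()), safe="")


def mint_uri(dataset: str, entity_type: str, local_id: str) -> str:
    """Build a deterministic URI: the same inputs always give the same URI."""
    dataset_slug = slugify(dataset)
    type_slug = slugify(entity_type)
    # Keep the source identifier's case, but percent-encode unsafe characters.
    id_part = quote(local_id.strip(), safe="")
    if not (dataset_slug and type_slug and id_part):
        raise ValueError("every URI component must be non-empty")
    return "/".join([BASE, dataset_slug, type_slug, id_part])


# Example with an invented record identifier.
print(mint_uri("Rare Books", "book", "RB 0042"))
```

Deriving the URI only from values already present in the source records means a mapping can be re-run without producing new identifiers for the same entity, which is one common reason for fixing such a policy before mappings are finalised.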

Deliverables / Outputs:

  • Semantic reference data models for three datasets: Rare Books, Machine Drawings, and VoH Images.
  • Production-ready X3ML mappings, including a replacement mapping for the Rare Book Collection.
  • Visual entity diagrams and cross-dataset semantic middle layer documentation.
  • Finalised semantic documentation structured for long-term reuse.

Outcome / Impact:

The project provides MPIWG with a unified semantic framework across three previously independent research collections, enabling interoperable Linked Data publication, graph-based queries, and a scalable foundation for the integration of future datasets into a coherent, standards-aligned knowledge infrastructure.