ontouml_models_lib.catalog

The catalog module provides the Catalog class, designed to manage and query collections of ontology models within the OntoUML/UFO Catalog.

This module supports loading multiple ontology models, querying them, and compiling the query results, using RDFLib for RDF graph operations and SPARQL queries.

Overview

The Catalog class is a specialized manager for handling a set of ontology models, each stored in its own subdirectory within a specified catalog path. Each subdirectory is expected to contain an ontology file (ontology.ttl) and an optional metadata file (metadata.yaml). The class supports complex queries across the entire collection by compiling results from individual models into a cohesive dataset.
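
To illustrate the directory layout described above, the following minimal sketch builds a throwaway catalog with the standard library and reports which subdirectories contain the expected files. The helpers describe_layout and build_demo_catalog, the folder names, and the file contents are hypothetical; only the file names ontology.ttl and metadata.yaml come from this documentation. The real catalog may keep model folders under a separate directory (see path_models), whereas this sketch places them directly under the root.

```python
import tempfile
from pathlib import Path


def describe_layout(catalog_path):
    """Return {folder_name: (has_ontology, has_metadata)} for each subdirectory."""
    layout = {}
    for sub in sorted(Path(catalog_path).iterdir()):
        if sub.is_dir():
            layout[sub.name] = (
                (sub / "ontology.ttl").is_file(),
                (sub / "metadata.yaml").is_file(),
            )
    return layout


def build_demo_catalog(root):
    """Create two hypothetical model folders; the second has no metadata file."""
    root = Path(root)
    for name, with_metadata in (("model-a", True), ("model-b", False)):
        folder = root / name
        folder.mkdir(parents=True)
        (folder / "ontology.ttl").write_text("# placeholder ontology\n")
        if with_metadata:
            (folder / "metadata.yaml").write_text("title: placeholder\n")
    return root


with tempfile.TemporaryDirectory() as tmp:
    demo_layout = describe_layout(build_demo_catalog(tmp))
```

Both folders count as valid models under the stated expectations, because the metadata file is optional.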

Usage

To use the Catalog class, instantiate it with the path to the catalog directory. The class will automatically load all ontology models from their respective subdirectories. Users can then perform SPARQL queries across the catalog, either on individual models or across the entire collection.

Examples

Example 1: Loading a Catalog and Executing a Single Query on All Models

>>> from ontouml_models_lib import Catalog
>>> from ontouml_models_lib import Query
>>> # Initialize the catalog with the path to the directory
>>> catalog = Catalog('/path/to/catalog')
>>> # Load a specific query from a file
>>> query = Query('./queries/query1.sparql')
>>> # Execute the query on all models in the catalog
>>> catalog.execute_query_on_all_models(query)

Example 2: Filtering Models by Attributes and Executing Multiple Queries

>>> from ontouml_models_lib import Catalog
>>> from ontouml_models_lib import Query
>>> # Initialize the catalog
>>> catalog = Catalog('/path/to/catalog')
>>> # Filter models whose language is English
>>> filtered_models = catalog.get_models(language="en")
>>> # Load multiple queries from a directory
>>> queries = Query.load_queries('./queries')
>>> # Execute the queries on the filtered models
>>> catalog.execute_queries_on_models(queries, filtered_models)

Example 3: Executing Multiple Queries on a Specific Model

>>> from ontouml_models_lib import Catalog
>>> from ontouml_models_lib import Query
>>> # Initialize the catalog
>>> catalog = Catalog('/path/to/catalog')
>>> # Get a specific model by ID
>>> model = catalog.get_model('some_model_id')
>>> # Load multiple queries from a directory
>>> queries = Query.load_queries('./queries')
>>> # Execute the queries on the specific model
>>> catalog.execute_queries_on_model(queries, model)

Dependencies

  • rdflib: For RDF graph operations and SPARQL query execution.

  • loguru: For logging operations and debugging information.

  • pandas: For compiling and managing query results in tabular format.

References

For additional details on the OntoUML/UFO catalog, refer to the official OntoUML repository: https://github.com/OntoUML/ontouml-models

Classes

Catalog

Manages a collection of ontology models in the OntoUML/UFO Catalog.

Module Contents

class ontouml_models_lib.catalog.Catalog(catalog_path, limit_num_models=0)

Bases: ontouml_models_lib.utils.queryable_element.QueryableElement

Manages a collection of ontology models in the OntoUML/UFO Catalog.

The Catalog class allows loading, managing, and executing queries on multiple ontology models. It compiles graphs of individual models into a cohesive dataset, enabling complex queries across the entire catalog. This class inherits from QueryableElement, which provides functionality to execute SPARQL queries using RDFLib.

Variables:
  • path (str) – The path to the catalog directory.

  • path_models (str) – The path to the directory containing the ontology model subfolders.

  • models (list[Model]) – A list of Model instances representing the loaded ontology models.

  • graph (rdflib.Graph) – An RDFLib Graph object representing the merged graph of all ontology models.

Parameters:
  • catalog_path (Union[pathlib.Path, str])

  • limit_num_models (int)

Examples

Basic usage example of the Catalog class:

>>> from ontouml_models_lib import Catalog
>>> from ontouml_models_lib import Query
>>> # Initialize the catalog with the path to the directory
>>> catalog = Catalog('/path/to/catalog')
>>> # Load a specific query from a file
>>> query = Query('./queries/query.sparql')
>>> # Execute the query on all models in the catalog
>>> catalog.execute_query_on_all_models(query)
>>> # Execute multiple queries on all models in the catalog
>>> queries = Query.load_queries('./queries')
>>> catalog.execute_queries_on_all_models(queries)
>>> # Execute multiple queries on a specific model
>>> model = catalog.get_model('some_model_id')
>>> catalog.execute_queries_on_model(queries, model)
catalog_path
path: pathlib.Path
path_models: pathlib.Path
models: list[ontouml_models_lib.model.Model] = []
graph: rdflib.Graph
execute_query_on_models(query, models, results_path=None)

Execute a single Query on a list of Model instances and save the results.

This method runs a single SPARQL query on each of the given ontology models and saves the results to the specified directory. If no directory is provided, the results are saved in the default “./results” directory.
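
The default-directory behaviour described here could be sketched as follows. This is an illustration only, not the library's implementation: the helper name resolve_results_dir and the choice to create missing parent directories are assumptions.

```python
import tempfile
from pathlib import Path


def resolve_results_dir(results_path=None, default="./results"):
    """Return the results directory as a Path, creating it if it is missing."""
    # Fall back to the documented default when no path is given.
    target = Path(results_path) if results_path is not None else Path(default)
    # Assumption: missing parent directories are created as well.
    target.mkdir(parents=True, exist_ok=True)
    return target


with tempfile.TemporaryDirectory() as tmp:
    explicit = resolve_results_dir(Path(tmp) / "out" / "run1")
    explicit_exists = explicit.is_dir()
    explicit_name = explicit.name
```

Accepting both str and Path and normalising to Path at the start keeps the rest of such a helper independent of the caller's argument type.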

Parameters:
  • query (Query) – A Query instance representing the SPARQL query to be executed.

  • models (list[Model]) – A list of Model instances on which the query will be executed.

  • results_path (Optional[Union[str, Path]]) – Optional; Path to the directory where the query results should be saved. If not provided, defaults to “./results”.

Raises:
  • FileNotFoundError – If the provided results path does not exist and cannot be created.

  • Exception – For any other errors that occur during query execution.

Return type:

None

Example:

>>> from ontouml_models_lib import Catalog
>>> from ontouml_models_lib import Query
>>> catalog = Catalog('/path/to/catalog')
>>> query = Query('./queries/query.sparql')  # Load a SPARQL query from a file
>>> models = catalog.get_models(language="en")  # Select the models to query
>>> catalog.execute_query_on_models(query, models)
execute_query_on_all_models(query, results_path=None)

Execute a single Query instance on all loaded Model instances in the catalog and save the results.

This method runs a single SPARQL query across all ontology models loaded in the catalog. The results are saved in the specified directory, or in the default “./results” directory if no directory is provided.

Parameters:
  • query (Query) – A Query instance representing the SPARQL query to be executed on all models.

  • results_path (Optional[Union[str, Path]]) – Optional; Path to the directory where the query results should be saved. If not provided, defaults to “./results”.

Raises:
  • FileNotFoundError – If the provided results path does not exist and cannot be created.

  • Exception – For any other errors that occur during query execution.

Return type:

None

Example:

>>> from ontouml_models_lib import Catalog
>>> from ontouml_models_lib import Query
>>> catalog = Catalog('/path/to/catalog')
>>> query = Query('./queries/query.sparql')  # Load a SPARQL query from a file
>>> catalog.execute_query_on_all_models(query)  # Execute the query on all models
execute_queries_on_model(queries, model, results_path=None)

Execute a list of Query instances on a specific Model instance and save the results.

This method runs multiple SPARQL queries on a specific ontology model. The results of each query are saved in the specified directory, or in the default “./results” directory if no directory is provided.

Parameters:
  • queries (list[Query]) – A list of Query instances to be executed on the model.

  • model (Model) – A Model instance on which the queries will be executed.

  • results_path (Optional[Union[str, Path]]) – Optional; Path to the directory where the query results should be saved. If not provided, defaults to “./results”.

Raises:
  • FileNotFoundError – If the provided results path does not exist and cannot be created.

  • Exception – For any other errors that occur during query execution.

Return type:

None

Example:

>>> from ontouml_models_lib import Catalog
>>> from ontouml_models_lib import Query
>>> catalog = Catalog('/path/to/catalog')
>>> model = catalog.get_model('some_model_id')  # Retrieve a model by its ID
>>> queries = Query.load_queries('./queries')  # Load multiple SPARQL queries from a directory
>>> catalog.execute_queries_on_model(queries, model)
execute_queries_on_models(queries, models, results_path=None)

Execute a list of Query instances on a list of Model instances and save the results.

This method runs multiple SPARQL queries across a specified set of ontology models in the catalog. The results of each query on each model are saved in the specified directory, or in the default “./results” directory if no directory is provided.

Parameters:
  • queries (list[Query]) – A list of Query instances to be executed on the models.

  • models (list[Model]) – A list of Model instances on which the queries will be executed.

  • results_path (Optional[Union[str, Path]]) – Optional; Path to the directory where the query results should be saved. If not provided, defaults to “./results”.

Raises:
  • FileNotFoundError – If the provided results path does not exist and cannot be created.

  • Exception – For any other errors that occur during query execution.

Return type:

None

Example:

>>> from ontouml_models_lib import Catalog
>>> from ontouml_models_lib import Query
>>> catalog = Catalog('/path/to/catalog')
>>> models = catalog.get_models(language="en")  # Filter models by language
>>> queries = Query.load_queries('./queries')  # Load multiple SPARQL queries from a directory
>>> catalog.execute_queries_on_models(queries, models)
execute_queries_on_all_models(queries, results_path=None)

Execute a list of Query instances on all loaded Model instances in the catalog and save the results.

This method runs multiple SPARQL queries across all ontology models loaded in the catalog. The results of each query on each model are saved in the specified directory, or in the default “./results” directory if no directory is provided.

Parameters:
  • queries (list[Query]) – A list of Query instances to be executed on all models.

  • results_path (Optional[Union[str, Path]]) – Optional; Path to the directory where the query results should be saved. If not provided, defaults to “./results”.

Raises:
  • FileNotFoundError – If the provided results path does not exist and cannot be created.

  • Exception – For any other errors that occur during query execution.

Return type:

None

Example:

>>> from ontouml_models_lib import Catalog
>>> from ontouml_models_lib import Query
>>> catalog = Catalog('/path/to/catalog')
>>> queries = Query.load_queries('./queries')  # Load multiple SPARQL queries from a directory
>>> catalog.execute_queries_on_all_models(queries)  # Execute the queries on all models
get_model(model_id)

Retrieve a model from the catalog by its ID.

This method searches for a model within the catalog’s loaded models by its unique ID. If a model with the specified ID is found, it is returned. Otherwise, a ValueError is raised.

Parameters:

model_id (str) – The ID of the model to retrieve.

Returns:

The model with the specified ID.

Return type:

Model

Raises:

ValueError – If no model with the specified ID is found.

Example:

>>> from ontouml_models_lib import Catalog
>>> catalog = Catalog('/path/to/catalog')
>>> model = catalog.get_model('some_model_id')  # Retrieve a model by its unique ID
get_models(operand='and', **filters)

Return a list of models that match the given attribute restrictions.

This method filters the loaded models based on specified attribute restrictions. It supports logical operations (“and” or “or”) to combine multiple filters. If only a single filter is provided, the operand is ignored.

Parameters:
  • operand (str) – Logical operand for combining filters (“and” or “or”). Defaults to “and”.

  • filters (dict[str, Any]) – Attribute restrictions to filter models. Attribute names and values should be passed as keyword arguments. Multiple values for the same attribute can be provided as a list.

Returns:

List of models that match the restrictions.

Return type:

list[Model]

Raises:

ValueError – If an invalid operand is provided.

Example:

>>> from ontouml_models_lib import Catalog
>>> catalog = Catalog('/path/to/catalog')
>>> # Filter models by a single attribute (language)
>>> filtered_models = catalog.get_models(language="en")
>>> # Filter models by multiple keywords
>>> filtered_models = catalog.get_models(operand="or", keyword=["safety", "geology"])
>>> # Filter models by multiple attributes (keyword and language)
>>> filtered_models = catalog.get_models(operand="and", keyword="safety", language="en")
remove_model_by_id(model_id)

Remove a model from the catalog by its ID.

This method searches for a model within the catalog’s loaded models by its unique ID. If a model with the specified ID is found, it is removed from the catalog. Otherwise, a ValueError is raised.
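
The lookup-or-raise behaviour described for get_model and remove_model_by_id can be sketched with plain Python objects. DemoModel, find_model, and remove_model are hypothetical stand-ins, and the assumption that a model exposes its identifier as an id attribute is not confirmed by this documentation.

```python
from dataclasses import dataclass


@dataclass
class DemoModel:
    """Hypothetical stand-in for Model; only an `id` attribute is assumed."""
    id: str


def find_model(models, model_id):
    """Return the first model whose id matches, or raise ValueError."""
    for model in models:
        if model.id == model_id:
            return model
    raise ValueError(f"No model with ID {model_id!r} found.")


def remove_model(models, model_id):
    """Remove the matching model in place; raises ValueError if it is absent."""
    models.remove(find_model(models, model_id))


demo_models = [DemoModel("a"), DemoModel("b")]
remove_model(demo_models, "a")
remaining_ids = [m.id for m in demo_models]

try:
    remove_model(demo_models, "missing")
    missing_raised = False
except ValueError:
    missing_raised = True
```

Because an unknown ID raises ValueError rather than returning None, callers that are unsure whether a model is loaded should wrap the call in try/except.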

Parameters:

model_id (str) – The ID of the model to remove.

Raises:

ValueError – If no model with the specified ID is found.

Return type:

None

Example:

>>> from ontouml_models_lib import Catalog
>>> catalog = Catalog('/path/to/catalog')
>>> catalog.remove_model_by_id('some_model_id')  # Remove a model by its unique ID
_load_models(limit_num_models)

Load ontology models from the catalog directory.

This method scans the catalog directory for subfolders, each representing an ontology model. It loads the models from these subfolders and initializes them as instances of the Model class. The loaded models are stored in the models attribute of the Catalog instance.

Parameters:

limit_num_models (int) – The maximum number of models to load. If 0, load all models.

Raises:

FileNotFoundError – If the catalog directory does not contain any model subfolders.

Return type:

None
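
A minimal sketch of the limit semantics described above, where 0 means "load everything". The helper name apply_limit is hypothetical, and the order in which the real method visits subfolders is not specified here.

```python
def apply_limit(subfolders, limit_num_models=0):
    """Return all names when the limit is 0, otherwise the first `limit_num_models`."""
    names = list(subfolders)
    if not names:
        # Mirrors the documented error for a catalog without model subfolders.
        raise FileNotFoundError("No model subfolders found in the catalog directory.")
    return names if limit_num_models == 0 else names[:limit_num_models]


all_names = apply_limit(["m1", "m2", "m3"])
first_two = apply_limit(["m1", "m2", "m3"], limit_num_models=2)
```

A small limit is convenient for quick experiments, since loading every model in a large catalog can take a while.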

_get_subfolders()

Retrieve the names of all subfolders in the catalog directory.

This method identifies all subfolders within the catalog’s path_models directory, which represent individual ontology models. Each subfolder is expected to contain the model’s ontology file and, optionally, its metadata file.

Returns:

A list of subfolder names within the catalog directory.

Return type:

list

_create_catalog_graph()

Create a merged RDFLib graph from all loaded models in the catalog.

This method combines the RDF graphs of all loaded models into a single RDFLib Graph object. The merged graph allows for executing SPARQL queries across the entire catalog, enabling comprehensive queries that span multiple models.

Returns:

A merged RDFLib graph containing all triples from the models in the catalog.

Return type:

Graph
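
Conceptually, merging graphs is a union of their triples. The sketch below shows that idea with plain Python sets of tuples standing in for RDF graphs; it deliberately does not use RDFLib, and the helper merge_triples and the example triples are hypothetical.

```python
def merge_triples(graphs):
    """Union several collections of (subject, predicate, object) tuples."""
    merged = set()
    for triples in graphs:
        merged |= set(triples)
    return merged


# Two hypothetical models that share one triple.
model_a = {
    ("ex:Person", "rdf:type", "ex:Kind"),
    ("ex:Student", "rdfs:subClassOf", "ex:Person"),
}
model_b = {
    ("ex:Person", "rdf:type", "ex:Kind"),
    ("ex:Car", "rdf:type", "ex:Kind"),
}
merged = merge_triples([model_a, model_b])
```

The shared triple appears only once in the merged set, so a query over the merged graph cannot tell which model contributed it. Per-model provenance requires querying the models individually.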

_match_model(model, filters, operand)

Check if a model matches the specified attribute filters.

This method evaluates whether a model meets the attribute restrictions defined in the filters dictionary. It supports combining multiple filters using logical operations (“and” or “or”), and returns a boolean indicating whether the model satisfies the filter conditions.

Parameters:
  • model (Model) – The model to be checked against the filters.

  • filters (dict[str, Any]) – A dictionary mapping attribute names to the values used to filter models.

  • operand (str) – Logical operand for combining filters (“and” or “or”).

Returns:

True if the model matches the filter conditions; False otherwise.

Return type:

bool

Raises:

ValueError – If an invalid operand is provided (not “and” or “or”).
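
The way the operand combines per-filter results can be sketched as follows. The helper combine_matches is hypothetical and works on already-computed booleans; how the library treats an empty set of filters is not documented here, so the edge case noted after the code applies to this sketch only.

```python
def combine_matches(matches, operand="and"):
    """Combine per-filter booleans with 'and' (all) or 'or' (any)."""
    matches = list(matches)
    if operand == "and":
        return all(matches)
    if operand == "or":
        return any(matches)
    raise ValueError(f"Invalid operand {operand!r}; expected 'and' or 'or'.")


both_required = combine_matches([True, False], "and")
either_suffices = combine_matches([True, False], "or")
```

In this sketch an empty list yields True for "and" and False for "or", which follows from the behaviour of the built-in all and any.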

_match_single_filter(model, attr, value)

Check if a model matches a single attribute filter.

This method evaluates whether a model meets a single attribute restriction. It compares the model’s attribute value to the provided filter value and returns a boolean indicating whether there is a match. It handles both single-value and list-value comparisons.

Parameters:
  • model (Model) – The model to be checked against the filter.

  • attr (str) – The name of the attribute to filter by.

  • value (Any) – The expected value or list of values to filter by.

Returns:

True if the model’s attribute matches the filter; False otherwise.

Return type:

bool
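
A minimal sketch of single-value and list-value matching, under stated assumptions. The function matches_single_filter and the demo object are hypothetical, and treating a list filter as "match if any listed value is present" is one possible reading of the description above, not confirmed behaviour.

```python
from types import SimpleNamespace


def matches_single_filter(model, attr, value):
    """Compare one attribute of `model` against `value` (scalar or list)."""
    model_value = getattr(model, attr, None)
    if model_value is None:
        # A missing or unset attribute never matches.
        return False
    # Normalise both sides to lists so scalar and list cases share one code path.
    model_values = model_value if isinstance(model_value, list) else [model_value]
    wanted = value if isinstance(value, list) else [value]
    # Assumption: a match means at least one wanted value is present.
    return any(item in model_values for item in wanted)


demo = SimpleNamespace(language="en", keyword=["safety", "risk"])
language_match = matches_single_filter(demo, "language", "en")
keyword_match = matches_single_filter(demo, "keyword", ["geology", "safety"])
```

Normalising both the model's value and the filter value to lists means a scalar attribute such as language and a list attribute such as keyword are handled by the same comparison.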