Promotors: Prof. Dr. F. van Harmelen, Dr. V. de Boer
Co-Promotors: Dr. L. Daniele, Dr. R. Siebes
IoT devices generate substantial amounts of measurement data and store it in many different formats. To make IoT devices interoperable, a number of ontologies, such as SAREF, have been developed; these can be used to create knowledge graphs that represent all kinds of measurements from IoT devices. We call the resulting knowledge graphs IoT measurement knowledge graphs. In this thesis, we investigate what sets IoT measurement knowledge graphs apart from regular knowledge graphs. In the first half, we describe in detail how to create IoT measurement knowledge graphs from real-world measurement data. In the second half, we investigate what happens when we use the graphs in different scenarios, with different methods and applications.
To generate an IoT measurement knowledge graph, we create a mapping that transforms the measurement data coming from IoT devices. We explore multiple ways to create this mapping and evaluate the resulting IoT measurement knowledge graph with competency questions. Using our mapping, we create OfficeGraph by transforming IoT measurement data recorded over a year from 444 IoT devices located in an office building. This IoT measurement knowledge graph is validated with competency questions created with the support of the building owners, showing that they can now answer questions they were not able to answer without it.
Besides the entities, literals, and relations in a knowledge graph, their combination, the context of an entity, provides additional knowledge. Therefore, if we want to use entities in knowledge graphs to train machine learning models, taking only the entities from the graph would leave out information. Representation models, such as RDF2Vec and GCNs, can be used to learn (embedding) representations for entities that take context into account, creating a representation based on which relations, entities, and literals occur near the entity in the graph.
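As an illustration of the mapping step described above, the following minimal sketch turns one raw measurement record into subject-predicate-object triples. It is not the mapping used for OfficeGraph: the record fields (`device_id`, `timestamp`, `value`, `unit`), the `EX` namespace, and the URI patterns are hypothetical, and the SAREF property names reflect our reading of the SAREF core ontology and should be checked against the ontology version in use.

```python
SAREF = "https://saref.etsi.org/core/"
EX = "http://example.org/"  # hypothetical namespace for devices and measurements


def measurement_to_triples(record):
    """Map one raw measurement record (a dict) to a list of (s, p, o) triples.

    Assumed record fields: device_id, timestamp, value, unit.
    """
    measurement = f"{EX}measurement/{record['device_id']}/{record['timestamp']}"
    device = f"{EX}device/{record['device_id']}"
    return [
        (measurement, "rdf:type", SAREF + "Measurement"),
        (measurement, SAREF + "hasValue", record["value"]),
        (measurement, SAREF + "hasTimestamp", record["timestamp"]),
        (measurement, SAREF + "isMeasuredIn", record["unit"]),
        (device, SAREF + "makesMeasurement", measurement),
    ]


example_record = {
    "device_id": "sensor-1",
    "timestamp": "2022-03-01T12:00:00Z",
    "value": 21.5,
    "unit": "degree-Celsius",
}
example_triples = measurement_to_triples(example_record)
```

Each record yields a measurement node with literal-valued attributes plus a single link to its device.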
We learn over IoT measurement knowledge graphs using multiple representation models and experiment with the effect of making more knowledge available to the representation models through semantic enrichment. This is done by making implicit information, such as the fact that two measurements are consecutive, explicitly available by adding a relation between the two measurements. Results show that the semantic enrichment has a positive effect on the learnability of the entity representations.
Due to the dynamic nature of IoT measurement knowledge graphs, new measurements are to be expected after the initial graph has been created. Therefore, we investigate the possibility of estimating embedding representations for new entities based on the existing entity representations. We introduce the embedding estimation method, which uses the numerical attributes of entities, the measurement values, to find entities similar to the new entity. By averaging the embedding representations of those entities, it then creates an estimated embedding for the new entity. We perform experiments to test the embedding estimation method and show that there is a trade-off: time is saved by not re-training the entire pipeline, at the cost of a loss in accuracy with the estimated embeddings.
This thesis offers the first focused research into IoT measurement knowledge graphs. We provide insight into how IoT measurement knowledge graphs can be created and how they differ from regular knowledge graphs, specifically in being numerical, dynamic, and shallow. We suggest and validate remedies for the shallowness of IoT measurement knowledge graphs through semantic enrichment. Furthermore, we show how the numerical aspect can be a strength, through the embedding estimation method, which can help with the dynamic nature of IoT measurement knowledge graphs.
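The two techniques summarized above, semantic enrichment with a relation between consecutive measurements and embedding estimation by averaging, can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the thesis implementation: the relation name `ex:hasNextMeasurement`, the use of absolute difference between measurement values as the similarity measure, and the neighbour count `k` are all hypothetical choices that the abstract does not specify.

```python
def link_consecutive(measurements):
    """Semantic enrichment sketch.

    measurements: list of (measurement_id, timestamp) pairs for one device.
    Returns triples linking each measurement to the next one in time.
    The relation name is a hypothetical placeholder.
    """
    ordered = sorted(measurements, key=lambda m: m[1])
    return [
        (earlier[0], "ex:hasNextMeasurement", later[0])
        for earlier, later in zip(ordered, ordered[1:])
    ]


def estimate_embedding(new_value, known_values, known_embeddings, k=3):
    """Embedding estimation sketch.

    Finds the k known entities whose measurement value is closest to
    new_value (assumed similarity: absolute difference) and returns the
    element-wise mean of their embeddings.
    """
    ranked = sorted(
        range(len(known_values)),
        key=lambda i: abs(known_values[i] - new_value),
    )
    nearest = ranked[:k]
    dim = len(known_embeddings[0])
    return [
        sum(known_embeddings[i][d] for i in nearest) / len(nearest)
        for d in range(dim)
    ]


links = link_consecutive([("m2", 2), ("m1", 1), ("m3", 3)])
estimated = estimate_embedding(
    20.4,
    known_values=[20.0, 21.0, 30.0],
    known_embeddings=[[0.0, 1.0], [1.0, 0.0], [5.0, 5.0]],
    k=2,
)
```

In this sketch the estimate needs no re-training, which illustrates the time saving; the accuracy cost arises because the averaged vector only approximates what a full re-training would have learned.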