We support you in implementing your projects. Data science is the key competence that will help you to shape the future.
With proven algorithms, models and analysis methods from practice and our innovative new modeling from research, we can support you in your projects.
Whether statistical analysis, machine learning, artificial intelligence or knowledge graphs, you can build on our experience and create new opportunities.
We believe that effective, data-rich solutions are the basis for maximizing existing resources. More and more companies understand that the real value lies on the one hand in the data itself, and on the other in the way that data is interconnected. Connections and relationships make everything coherent.
A reliable database and objective analyses, well visualized, provide an important basis for decision-making. Unlike traditional databases, graph databases store the links between data items alongside the data itself. This helps you to delve deeper into data analysis, to analyze the connections between data and, above all, to visualize them.
AI-based technologies provide insights and optimize your processes. Manufacturing processes are optimized in a way that has never been seen before, and product reliability, quality, safety and yield are improved.
We advise comprehensively and help you to tap your potential. Data modernization helps create a compliant, accessible, and useful data foundation to enable business decisions. Obtaining, structuring, analyzing and protecting data with our know-how helps you to lay the foundations for successful work.
Graph databases are all about searching and discovering what happens naturally. When querying, a graph database takes into account the entire structure of the available connected data and provides you with more detailed results. In addition, you gain, for example, deeper insight into the relationships between entire industries, so that you can optimize your business models and develop new products on the basis of these findings.
High Performance Data Analytics (HPDA) uses High Performance Computing (HPC) in combination with data analysis to identify patterns and insights. With the advent of high performance cloud computing and data analysis, extremely large amounts of data can be queried in real time.
Predictive analytics is the process of using past data to predict future trends. At its simplest, this means using historical data to build a model that captures important patterns. The model can then use new data to predict future developments.
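At its simplest, this idea can be sketched in a few lines. The example below is a minimal illustration only: the monthly figures are made up, the names fit_line, months and demand are hypothetical, and a straight-line trend is assumed purely for simplicity.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical historical data (made-up numbers): month index and demand.
months = [1, 2, 3, 4, 5, 6]
demand = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]

# Step 1: build a model from the historical data.
slope, intercept = fit_line(months, demand)

# Step 2: apply the model to a new input (month 7) to predict the future value.
forecast = slope * 7 + intercept
print(f"forecast for month 7: {forecast:.1f}")
```

Real applications would typically use richer models and hold back part of the data to check how well the predictions generalize.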
The problem with data semantics is how to establish and maintain the correspondence between a data source and its intended subject. Get deep insights from multiple data sources to improve your company's decision-making ability.
Intelligent data management is the basis for data science and the use of learning systems. New types of analytics, such as machine learning on new data sources stored in the data lake, help you identify and exploit business growth opportunities faster, increase productivity, proactively maintain devices and make informed decisions.
Performance Intelligence provides more operational agility and resilience for industries by combining the power of information and artificial intelligence (AI) with human insights.
Our business intelligence tools and services help you turn data into actionable insights that feed into your company's strategic and tactical business decisions.
In addition to numbers, text is an essential part of business activity. It is important to uncover insights such as sentiment analysis, entities, relationships and key phrases in unstructured text. Using NLP, AI and graph databases, it is possible to convert these findings into strategies and actions.
The use of terms in connection with passivhaus, and in particular with the term “build”, reveals the discussions that are taking place and the emphasis that is placed on subjects. Here we briefly present an analysis based on roughly two months of Twitter data.
Let us look at the frequency distribution of terms that are used in conjunction with the term “build”. This distribution is of the power law type! Such an observation seems to be typical for co-occurrence of hashtags and can take the form of a small-world network. The power law is shown in the inset of the distribution figure in a log-log plot, where it appears as a straight line. The red line shows the power law fit with negative exponent b.
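A power law fit of this kind can be sketched as a straight-line fit in log-log space. The snippet below is an illustration under stated assumptions, not the procedure used for the figure: the counts are synthetic, the function name fit_power_law is hypothetical, and simple least squares on the logarithms is assumed (for noisy real data other estimators are often preferred).

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**b by least squares on (log x, log y).

    Returns (a, b). For a decaying distribution b is negative.
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((x - mean_x) ** 2 for x in log_x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_x, log_y))
    b = sxy / sxx                      # slope of the line in the log-log plot
    a = math.exp(mean_y - b * mean_x)  # intercept, transformed back
    return a, b


# Synthetic term frequencies (not the Twitter data): rank 1..50, exponent -1.5.
ranks = list(range(1, 51))
counts = [1000.0 * r ** -1.5 for r in ranks]

a, b = fit_power_law(ranks, counts)
print(f"a = {a:.1f}, b = {b:.2f}")
```

Because the synthetic counts follow an exact power law, the fit recovers the exponent that was used to generate them.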
Recall that a power law distribution exhibits the property of scale invariance. We might instead have expected the distribution of terms to be an exponential, that is, a very fast decay in which only very few terms are relevant (have a high frequency of usage) and the rest occur much less frequently.
A scale-invariant distribution has the property that all distributions with the same exponent b are in effect the same: if you rescale the x-axis, i.e. replace x by s*x, the distribution keeps its shape and changes only by a constant factor. Let us take the term “health”. The analysis then shows the same exponent b as for the term “build”!
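The scale invariance can be made explicit in one line. The notation here is an assumption for illustration: f(x) stands for the frequency distribution, a for a constant prefactor and b for the (negative) exponent of the fit.

```latex
f(x) = a\,x^{b}
\quad\Longrightarrow\quad
f(s\,x) = a\,(s\,x)^{b} = s^{b}\,a\,x^{b} = s^{b}\,f(x)
```

Rescaling x by s therefore only multiplies the distribution by the constant s^b; in a log-log plot the line is shifted but its slope b stays the same. An exponential decay does not have this property.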
To demonstrate the point let us look at the “build” distribution in terms of a knowledge graph, as shown in Figure 2. This looks very messy and does not show much information and structure. Let us drill down into this and only take those terms (nodes in the graph) that have at least 10 connections to other terms. Those terms are the most frequently used terms in combination (see Figure 3).
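The drill-down step can be sketched as a simple degree filter. The snippet is a minimal illustration: the co-occurrence pairs are made up, the function names build_graph and well_connected_terms are hypothetical, and the toy example uses a threshold of 2 rather than the 10 connections used in the analysis above.

```python
from collections import defaultdict


def build_graph(cooccurrences):
    """Build an undirected graph: term -> set of terms it co-occurs with."""
    graph = defaultdict(set)
    for first, second in cooccurrences:
        if first != second:  # ignore a term paired with itself
            graph[first].add(second)
            graph[second].add(first)
    return graph


def well_connected_terms(graph, min_degree=10):
    """Return the terms that are connected to at least min_degree other terms."""
    return {term for term, neighbours in graph.items()
            if len(neighbours) >= min_degree}


# Made-up co-occurrence pairs, for illustration only.
pairs = [
    ("build", "greenbuilding"),
    ("build", "passivhaus"),
    ("build", "netzero"),
    ("greenbuilding", "passivhaus"),
    ("netzero", "climateaction"),
]

graph = build_graph(pairs)
kept = well_connected_terms(graph, min_degree=2)
print(sorted(kept))
```

Note that the filter counts connections in the full graph; it does not repeat the filtering on the reduced graph.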
Figure 3 shows much more structure. The size of the text indicates the amount of connectivity. Hence “greenbuilding”, no surprise here, takes center stage, besides the different spellings such as “building” and “buildings” that we lump together into one bucket.
Next in line are “sustainablebuilding(s)” and “healthybuilding(s)”. This is an interesting observation. A distinction is made with respect to health. Indeed, a further analysis reveals that one line of discussion revolves around issues like air quality in passivhaus buildings. The terms “buildingperformance” and “buildingmaterials”, which follow in importance of usage, also need to be seen in this context.
This is followed in importance by “energyefficiency”, together with a large block of terms such as “buildingscience”, “climateaction”, “buildingdesign” and “netzero”, all emphasizing the impact on our climate and the consumption of energy. A sentiment analysis shows that much of the negative sentiment is concerned with health aspects in buildings, in particular in the educational building sector.
The health aspect is picked up again by the term “ventilation” that comes next in line in importance. In Figure 4 the rather intricate discussion on health is shown.
It clearly shows the concerns, as well as the issues that are tackled. The main issues are “publichealth”, “healthyschools”, “mentalhealth” and “wellness”. Hence public buildings in particular, such as schools, toward which a large drive for passivhaus design and buildings was and is directed, lead the discussion. However, the central discussion revolves around energy consumption and climate impact rather than whether the buildings are healthy.
How best to demonstrate the power of data analytics? Let us explore a particular application field to demonstrate the power of the techniques and showcase business applications.
The domain of passivhaus covers a broad range of topics. These range from ecological, social and societal to engineering topics. Even within one topic, a rather vast data lake awaits anybody who just wants to cover the ground. Much of the data is in textual form. Obtaining exact knowledge from the massive literature tends to be time consuming, so the literature has to be transformed into structured knowledge that allows efficient management of the data as well as the extraction of business knowledge and action. It is rather inefficient to obtain useful information and knowledge from the massive and noisy plain text that needs to be ingested for everyday business, as well as for domain research activities to stay ahead of the competition. This is where semantic text analysis, artificial intelligence and knowledge graphs come into play for data integration, analysis and interpretation.
In a series of short articles, we show how data analysis techniques are applied. Business cases for semantic data analysis and in particular knowledge graphs are shown. We will further plunge into the social aspect by analysing the social data that comes with the topic.
The starting point is information (data) from a variety of heterogeneous sources and heterogeneous data formats. These sources and formats can be product data sheets, articles, scientific papers, opinion pieces, tweets, blog entries, … or even numerical datasets. The possibilities are nearly endless, depending on the type of application of the data that is required. Most of the time these data sources are scattered within the company, resting in inaccessible silos, or are spread across different types of physical sources and data formats. We would like to aggregate the many sources into one ‘database’ and make it accessible such that we can, for example, draw conclusions from the ensuing structure, infer dependencies, query the result, etc. More on this in the later parts of this series. The amount of data is such that we can neither compile it manually nor curate and annotate it manually.
Furthermore, we want to go beyond the possibilities that generic databases offer. Recall that for a database you have to specify tables, etc., i.e., we have to know and specify in advance every nitty-gritty bit of detail. So we get nothing new out of it. Imagine instead that we only partially specify a framework, such as a namespace, that grows as we go along, leaving the structure itself to evolve. Eventually, as the ‘database’ fills up, we can start to detect structures and draw conclusions from them, i.e. from the hidden interconnectivity of the data that was neither obvious nor premeditated in the first place.
Let us kick off this series by naming some use cases for data analysis and knowledge graphs. Of course, we have not yet defined what we understand by a knowledge graph. Operationally, we think of a knowledge graph as a large-scale semantic network that can realize the structured storage of complex interconnected data. Note that there are different layers of meaning of ‘semantic’ here. For the moment we postpone this discussion, which will be helpful for the applications in different business cases.
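To make the operational view a little more concrete, a knowledge graph can be pictured as a growing set of (subject, relation, object) statements that needs no table layout fixed in advance. The sketch below is only an illustration: the class TripleStore, its methods and the example statements are hypothetical and stand in for a real graph database.

```python
class TripleStore:
    """A tiny in-memory store of (subject, relation, object) statements."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, relation, obj):
        # No schema is declared up front; new relations simply appear as data.
        self.triples.add((subject, relation, obj))

    def query(self, subject=None, relation=None, obj=None):
        """Return all statements matching the given parts (None = wildcard)."""
        return sorted(
            t for t in self.triples
            if (subject is None or t[0] == subject)
            and (relation is None or t[1] == relation)
            and (obj is None or t[2] == obj)
        )


# Illustrative statements only, not extracted from any real source.
store = TripleStore()
store.add("passivhaus", "requires", "ventilation")
store.add("ventilation", "affects", "air quality")
store.add("passivhaus", "reduces", "energy consumption")

about_passivhaus = store.query(subject="passivhaus")
for statement in about_passivhaus:
    print(statement)
```

The structure of the graph is whatever the accumulated statements imply, which is what allows connections to be discovered after the fact rather than specified beforehand.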
Now let us talk about applications once we have collected and processed the data. Here is a collection of some obvious use cases.
The above-mentioned use cases are passive by their nature. The knowledge graph is an aggregated resource waiting to be tapped. In the active use case of a knowledge graph, the graph itself is part of an application. Consider a passivhaus smart home. In this scenario, the knowledge graph is part of the network of sensors and devices and actively participates in the regulation, say of the heating or ventilation of the home together perhaps with AI (artificial intelligence) building on the acquired knowledge.
One last application that we briefly touch upon is opinion mining across different sources. For a sustainable structural design, one often needs to include opinion in the decision and design process besides facts, be it about the material that one uses in a passivhaus construction or about the design itself. Knowledge graphs help to identify and aggregate information and opinion through relation extraction and acquisition, and through enrichment with network data.
In the second part of this series, we will discuss the different data sources and the extraction of semantic data to build a knowledge graph and what it entails upfront.