Speaking in CIM: The 26-Year-Old Language of the Power Grid

If you are finding the CIM confusing, rest assured that you are not alone. It is very unfortunate that such an inconvenient format has emerged for transferring this type of data.

—Anonymous CIM user and developer

Image source: The Tower of Babel by Pieter Bruegel the Elder (1563)

If These Electrons Could Talk…

As the power grid undergoes modernization and companies wield the flaming sword of “digital transformation,” it’s becoming increasingly clear that the world’s largest and most complex machine has a major data problem.

To say that this problem is of Biblical proportions would be a little bit dramatic, but it is reminiscent of the Tower of Babel.

The problem is as follows: the modern grid demands more communication amongst its parts, but everyone is speaking a different language.

Information is scattered across closed-source, proprietary software, and power systems engineers are forced to act as data engineers and data scientists, maintaining complex ETL pipelines and a vast collection of Python scripts just to manipulate and manage their data.

Below is a simplified overview of the smart grid that we can use to explore the scale of this information problem.

Not too long ago, these domains were much more isolated. Each team stored their respective data in a different format, and while communication definitely still did exist between departments, it was much less frequent, required more manual work, and was often ad hoc in nature.

But things are different now. On top of the already complex physical grid, we now have communication networks layered in; networks that reflect and, in some ways, amplify the complexity of the grid itself.

These networks provide automated, structured communication between the various areas of grid operations.

They’ve become the major arteries of the smart grid.

Now, I want to point out that, theoretically, this graphic could represent just one electric utility company. Each network, signified by the cloud shapes in the diagram, is the internal network for just a single, local utility, and its existence does not necessarily mean that the neighboring utility would have access to or understand that network in any meaningful way.

Continuing with this theoretical utility, when the Transmission department and the Operations department need to communicate, they must develop a brand new communication channel, and define some kind of translation between their datasets. “Our substations have data that looks like ‘x, y, z’”, “when we say ‘tower’ we mean…”.

This is a dirty job. It’s full of what can euphemistically be referred to as “data wrangling.” Ancient databases are resuscitated and spreadsheets are pruned as teams try to add their data to the network. The role of the electrical engineer is blurred as they become de facto Pythonistas and ETL experts.

And the grid is still changing! These networks will need to adapt and grow as the grid does, whether the trigger is a new FERC order, co-simulation, nuclear fusion, or any other energy technology or business requirement that surfaces.

Over and over, for each utility, for each department, and for each new piece of software, this work repeats.

If nothing else, this is a very expensive problem. Expensive in this context might mean actual man-hours and $$$ lost to inefficiencies, or it could mean an actual risk to the grid. E.g., if the smart grid relies on communication channels, but those communication channels are not effective, it could, say, result in a mishandling of a voltage instability and ultimately result in a blackout.

At its core, this is an information technology issue—one that architects of the internet thought a lot about.

The power grid is creating new communication channels, but we have no language with which to speak and send messages across.

It’s almost like we have these big, beautiful cellular towers, but no cell phones, and no SMS messaging standards. We don’t even have English.

This problem sits at the intersection of electrical engineering, human language, and information theory. The solution is going to have to be something grand.

Interconnected Networks

To understand this problem better, we can look at another complex system which has managed to more elegantly navigate its own information challenges. And there’s only one technology that’s even remotely comparable to the sprawling and complex nature of the power grid—the World Wide Web.

These two systems of systems of systems (the internet and the power grid) exist as shared infrastructure built for the common good. They have physical components which cross geopolitical boundaries, their operation requires collaboration from teams working in a multitude of different disciplines and languages, utilizing different software, and they’ve both become an essential part of modern life.

But unlike the power grid, which was built to reliably provide electricity, the internet was built specifically for communication. It is the progeny of information exchange. It is, quite literally, interconnected (communication) networks.

As the power grid evolves, I think that it could learn quite a lot from its younger cousin, the internet.

We take it for granted, but the internet has solved (or at least alleviated) the unsolvable problem of human communication.

And it has done so with web standards.

Web standards are “formal, non-proprietary standards and other technical specifications that define and describe aspects of the World Wide Web”.

These standards (alongside massive adoption and cooperation between many entities) have defined not only clear communication channels, but also languages with which to communicate across these channels. Thanks to open web standards we can open a web browser, map a domain name to an IP address, read data from the web server, interpret the HTML and DOM in our browser, and do all that just to read a blog post on simplethread.com.

The Common Information Model

If the smart power grid and all of its related networks are going to work, it’s going to need its own set of standards.

It’s not going to be easy, though. The internet works, but it’s not simple. (If you’ve ever had to do anything with DNS or JavaScript, you know what I’m talking about.) Some of it will be unnecessary complexity (re: JavaScript), but there is real, essential complexity at the root of these interconnected networks, and that will have to be captured in these standards.

Introducing: the CIM.

The Common Information Model, commonly referred to as the CIM (pronounced like sim, as in simple, although it’s anything but simple), has become sort of an obsession of mine. And I’m not kidding about that—my wife will tell you.

So, all of the problems that we’ve discussed up to this point? That’s pretty much why the CIM exists.

The CIM aims to be that standard we’ve been looking for. It’s the definition of the communication channels, the messaging protocols, and the common language.

So that sounds great, but what is the CIM, really?

Unfortunately, there is no easy answer to that question. The CIM, by its very nature, is abstract, and I find that no single definition does it justice.

Also, the information on the CIM that’s out there is quite difficult to find and hard to understand.

That being said, I’ll try and synthesize what I know, and we can at least get a glimpse of the tip of the iceberg.

The IEC defines the CIM as, “an abstract model that represents all the major objects in an electric utility enterprise typically involved in utility operations.”

It’s important to call out that the CIM not only defines all major objects of the grid but also their relationships, i.e., a substation might contain buses, a transmission line is made up of towers and conductors, etc.
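To make "objects plus relationships" a little more concrete, here is a minimal Python sketch of the two examples above. The class and field names are hypothetical and chosen for readability; they are not the actual CIM class names, and the real model is far richer than this.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical, simplified classes for illustration only (not real CIM classes).
@dataclass
class Bus:
    name: str


@dataclass
class Substation:
    name: str
    # Relationship: a substation might contain buses.
    buses: List[Bus] = field(default_factory=list)


@dataclass
class Tower:
    name: str


@dataclass
class Conductor:
    name: str


@dataclass
class TransmissionLine:
    name: str
    # Relationship: a transmission line is made up of towers and conductors.
    towers: List[Tower] = field(default_factory=list)
    conductors: List[Conductor] = field(default_factory=list)


substation = Substation("Example Substation", buses=[Bus("Bus 1"), Bus("Bus 2")])
line = TransmissionLine(
    "Example Line",
    towers=[Tower("T1"), Tower("T2")],
    conductors=[Conductor("Phase A")],
)

# Because the relationships are part of the model, we can walk them directly.
bus_names = [bus.name for bus in substation.buses]
print(bus_names)
```

The point of the sketch is only that the relationships live in the model itself, rather than in a spreadsheet column or in someone's head.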

It does this without being proprietary, and without adhering to any one vendor, tool, department, or data format! So for example, it doesn’t define a SQL schema, as that would be too restrictive and would imply use in a specific database. Instead, it needs to be able to serve not only database applications, but also EMS, power simulation software, interconnection queues, energy markets, GIS—it has to do it all.

Hence, the CIM was developed as an abstract model. A meta-model of sorts. Sometimes it’s called a canonical model. It’s also referred to as a common “ontology.”

This ontology lives (or, in CIM speak, is semantically represented) as a massive UML model. When I say massive, I mean it’s more than 2 GB, just for the model itself.

In practice, because the model is so large and unwieldy, a subset of the model is almost always used. There’s even a term for this: a CIM Profile.

Below is a very, very small subset of the CIM ontology (a.k.a. a CIM profile), generated from a CIM tool being developed at Pacific Northwest National Laboratory (PNNL).

All of this probably sounds quite strange (at least, it did for me). Canonical models, semantics, UML, ontology… As a software engineer, I wasn’t familiar with these things! This isn’t how programming or the internet or any of this works!

And that’s true. This language and these tools are actually vestiges of the semantic web.

The semantic web was part of Tim Berners-Lee’s grand vision for the internet, but it never quite panned out.

(Tim Berners-Lee founded the W3C, the consortium behind web standards, so, yeah, he’s a pretty important figure when it comes to communication standards.)

The idea behind the semantic web was that semantics (i.e., the nature of the data) are encoded alongside the data itself. At a very high level, it defines a way for me to say “ruby (the programming language, not the precious gem)”, or “apple (the tech company, not the fruit)”, leaving no room for ambiguity.
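As a rough illustration of that idea, the sketch below stores facts as (subject, predicate, object) triples in plain Python, with every "thing" identified by a URI instead of a bare word. All of the example.org URIs are made up for this example, and a real semantic web stack would use an RDF library rather than a list of tuples.

```python
# Every identifier is a URI, so two things that share the label "ruby"
# are still two distinct things. All URIs here are hypothetical.
RUBY_LANGUAGE = "http://example.org/id/ruby-programming-language"
RUBY_GEMSTONE = "http://example.org/id/ruby-gemstone"
PROGRAMMING_LANGUAGE = "http://example.org/vocab/ProgrammingLanguage"
GEMSTONE = "http://example.org/vocab/Gemstone"
IS_A = "http://example.org/vocab/isA"
LABEL = "http://example.org/vocab/label"

# Each fact is a (subject, predicate, object) triple.
triples = [
    (RUBY_LANGUAGE, IS_A, PROGRAMMING_LANGUAGE),
    (RUBY_LANGUAGE, LABEL, "ruby"),
    (RUBY_GEMSTONE, IS_A, GEMSTONE),
    (RUBY_GEMSTONE, LABEL, "ruby"),
]


def things_labeled(label):
    """Return the URIs of every subject that carries the given label."""
    return sorted(s for s, p, o in triples if p == LABEL and o == label)


def kind_of(subject):
    """Return the type URI recorded for a subject."""
    return next(o for s, p, o in triples if s == subject and p == IS_A)


# The word "ruby" is ambiguous, but each URI it maps to is not.
for uri in things_labeled("ruby"):
    print(uri, "is a", kind_of(uri))
```

The label is just one more fact about the thing; the meaning travels with the URI.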

The semantic web requires unique technology:

RDF (Resource Description Framework) for describing data.
OWL (Web Ontology Language) for defining vocabularies.
SPARQL for querying semantic data.

Today, RDF, OWL, and SPARQL are not very popular. We’re much more familiar with structured data through REST or GraphQL APIs. But the semantic web tools of RDF, OWL, and SPARQL are all still very much a part of working with CIM.
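To give a feel for what this looks like in practice, here is a small sketch that reads a tiny, hand-written RDF/XML fragment in the style of a CIM file, using only Python's standard library. The CIM namespace URI, the identifiers, and the exact class and attribute names are assumptions for illustration; they vary by CIM version and profile, so check them against the model you actually use. A real project would more likely reach for an RDF library and SPARQL than for hand-rolled parsing.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumed namespace for illustration; the real URI depends on the CIM version.
CIM_NS = "http://iec.ch/TC57/2013/CIM-schema-cim16#"

# A hand-written fragment, not output from any real tool.
DOCUMENT = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:cim="{CIM_NS}">
  <cim:Substation rdf:ID="_sub1">
    <cim:IdentifiedObject.name>Example Substation</cim:IdentifiedObject.name>
  </cim:Substation>
  <cim:VoltageLevel rdf:ID="_vl1">
    <cim:IdentifiedObject.name>115 kV</cim:IdentifiedObject.name>
    <cim:VoltageLevel.Substation rdf:resource="#_sub1"/>
  </cim:VoltageLevel>
</rdf:RDF>
"""


def short(tag):
    """Turn '{namespace}Name' into the more readable 'cim:Name'."""
    return tag.replace(f"{{{CIM_NS}}}", "cim:")


root = ET.fromstring(DOCUMENT.strip())
triples = []
for resource in root:
    subject = "#" + resource.attrib[f"{{{RDF_NS}}}ID"]
    triples.append((subject, "rdf:type", short(resource.tag)))
    for prop in resource:
        # A property is either a reference to another resource or a literal.
        value = prop.attrib.get(f"{{{RDF_NS}}}resource", prop.text)
        triples.append((subject, short(prop.tag), value))

# Roughly the question a SPARQL query would ask:
# "which voltage levels belong to substation #_sub1?"
voltage_levels = [
    s for s, p, o in triples
    if p == "cim:VoltageLevel.Substation" and o == "#_sub1"
]
print(voltage_levels)
```

Notice that the relationship between the voltage level and the substation is just another triple, which is the RDF way of expressing the "objects and their relationships" idea from earlier.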

And much like the semantic web, the CIM is a standard that never quite reached critical mass. To steal an analogy from chemistry, the CIM never reached the activation energy to make the reaction (adoption) happen.

Now the CIM lives in a liminal space, along with the semantic web, on an island of misfit standards.

26 Years Later: Why Isn’t Everyone Fluent in CIM?

When I uncovered just how complex the CIM was, along with the tools required to work with the model and the semantic web-iness of it all, one of my first thoughts was, “Why isn’t there anything better?” And the next question was, “How hard would it be to make a new standard?”

Since then I’ve talked to some really smart people who have been in the CIM world for much longer than myself, and the consensus seems to be that, well, it’s all we’ve got. It’s by far the most mature and widely adopted non-proprietary standard, and it would not be worthwhile to create a new standard. Rather, the best course of action is to improve upon the current ontology.

While there are certainly some users of the CIM (especially in Europe, with ENTSO-E being a driving force for standardization), the Common Information Model has not seen wide adoption and success.

I think there are a couple reasons why.

The first has to do with the “activation energy” analogy.

The electric utility industry is big, but it’s far, far smaller than the World Wide Web. Tim Berners-Lee and others founded the W3C in October 1994, started drafting web standards at MIT, and a few months later the dot-com boom was fully underway. The authors of these standards were able to ride the wave of the dot-com boom: their systems saw rapid growth, and there was enormous interest, pseudo-unlimited funding, and widespread adoption. They reached and far surpassed the activation energy required for this reaction.

The electric grid is smaller, more regulated, and it simply cannot grow that fast. Grid operators are practically the opposite of the startup mantra, “move fast and break things.” They move slowly, make safe decisions, and keep the lights on.

It’s for these same reasons that there’s very little open source software in this industry. There’s an interesting relationship between open standards and open source software. Both require herd adoption to reach critical mass. But despite some valiant efforts on behalf of open source developers, the industry, over and over, fails to escape the clutches of proprietary software.

The reality is that adopting standards, like moving to open source software, is risky and, at the end of the day, not a high priority. There are so many other risks to manage. Why throw in an abstruse, perhaps idealistic standard that no one is asking for? And why migrate away from a proprietary system that has been thoroughly vetted, that engineers know how to use, and that has been successfully keeping the electrons flowing for many years?

How Long Will It Take?

The grid is currently undergoing a “digital transformation”. But here’s the thing: this didn’t start a few years ago with renewable energy, electric vehicles, digital twinning, or DERs.

The CIM was officially published in 1999, as part of the IEC 61970 series, under the Technical Committee 57 (TC57) of the IEC, because utilities saw that there was a bottleneck in proprietary software. This was at the peak of the dot-com boom over 25 years ago. Organizations like the CIM Users Group (CIMug) have been around since 2005 trying to improve the CIM and increase adoption.

How long will it take to see widespread adoption across the globe? At this rate, maybe another 25 years? Or will there be a newcomer, or some other X factor, which causes a new chemical reaction to take place?

The character encounter graphic from Super Smash Bros. Melee

The Future of Power Grid Standards

I strongly believe that the power grid needs to adopt standards if the smart grid is going to work at all.

I don’t love the CIM (romantically or academically), but trusting in the wisdom of crowds, I might just need to swallow that pill and accept that this is the standard.

It does give me hope that semantic web technology is being used by companies like Netflix to develop their own ontology and semantic models. Maybe it’s not so crazy after all.

But if we’re going to incorporate the CIM as the grid modernizes, it’s going to require a stronger catalyst, sustained energy input, and deeper systemic change to overcome the power grid’s own activation barrier.

What does that look like?

Well, I think one of the most important things that we can do is spread awareness and clarity.

PNNL is doing great work on the clarity front with their CIM toolbox. Bringing the CIM out of obscurity and into normal Python libraries that normal developers can use and understand is incredibly valuable. They’re also releasing useful insights at the intersection of software engineering and CIM standards with publications like A Power Application Developer’s Guide to the Common Information Model.

EPRI publishes a very comprehensive CIM primer. It’s currently on its ninth edition.

At the end of the day, not everyone has to be a CIM aficionado. Continuing the internet analogy: we’re not all DNS experts or experts on the JavaScript abstract syntax tree. Instead, researchers and specialists bring that knowledge to us and make it usable through products.

The CIM is a base layer. It has the potential to be the foundation that software products, data schemas, network model file types, and more can all use as a shared language. These tools can speak CIM and most of us can remain none the wiser.

Let’s Talk About CIM!

A lot of time and research went into this post, even though I didn’t get into the really technical stuff, like actually building products with CIM, building CIM profiles, using SPARQL, different flavors of XML, etc.

If you’re interested in more technical details, please let me know and I’d love to continue exploring CIM.

I’d like to call out a few people and resources that helped demystify CIM or otherwise contributed to this post:

Dan Kopin @ VELCO and Yang Feng @ Siemens, whose presentation at DTECH in Dallas in March of this year kicked all of this off for me
Bill Meehan @ ESRI and his CIMplified series
Alex Anderson @ PNNL for responding to my messages and pointing me to CIM resources
Richard Lincoln for answering my emails, contributions to open source, and CIM research
Eric Meier @ ERCOT for actual thought leadership(!) happening on LinkedIn in the power systems world
Colin Gault @ Reactive Technologies for talking to me at DTECH about CIM, co-simulation, digital twins, and more

Please reach out if you have any questions, corrections, or additions you’d like to make to this conversation. Happy standardizing!
