Hey! Quick update on the technical situation of EUDAMED: We are screwed 😂
Okay, hold on, let me explain: Building BEUDAMED, a better EUDAMED database, gave me a unique opportunity to “look behind the curtains” of what’s going on in EUDAMED on a technical level. And boy, are you in for some ugly surprises.
Now, I could start rambling about technical details, but I won’t bore you and instead only focus on the most glaring one: Duplicate and missing certificates.
What do I mean by that? Let me show you some numbers, and then let’s discuss what they mean.
First off, if you navigate to the “Certificates” search form in EUDAMED, you are able to, you guessed it, search for all certificates. Now, if you have nothing better to do with your life, like I do, and go through all pages of the search results, like I did, you will notice that there are 442 certificates listed in total, at the time of writing.
Okay, 442 certificates. Cool.
Wait, what? That doesn’t sound right. Shouldn’t there be more? On LinkedIn, every few days a random company announces yet another random MDR certificate. So 442 can’t be right. We should have more.
Enter the first problem: “Disappearing certificates”.
“Disappearing Certificates” On EUDAMED
Yes, that’s true! There are more certificates, but they’re not displayed in the “Certificates” search results, for weird reasons only the EUDAMED decision makers know.
Instead, you can only find them by randomly finding a device which has an associated certificate. And then, you can only view that certificate on the device page on EUDAMED, but again, not via the certificate search form and search results.
Now, if you still have nothing better to do with your life, like I do, and write software to go through all ~500k device pages, like I did, you will notice that there are another 3,843 certificates hidden in device pages. Yep.
So that’s what I call the “disappearing certificates” – in other words, the certificates search results in EUDAMED only show you around 10% of all certificates. All others have “disappeared” and are only randomly findable via their respective device pages.
Okay – so 90% “disappearing certificates”. That’s cool. But it gets worse. What else do we have?
The Certificate Inflation (Duplicate Certificates)
If you have some regulatory knowledge (can’t blame you if you don’t), you might now think: “Hmm, but sometimes multiple devices are covered by the same certificate, right? What about that?” – and that’s a really good thought.
So let’s say you navigate to the page of Device A on EUDAMED, and it shows a certificate with the number “TÜV-SÜD-42”. Then, you navigate to the page of Device B on EUDAMED, and it again shows a certificate with the number “TÜV-SÜD-42”. As certificate numbers are unique, you will now think “okay cool, so both of these devices are covered by the same certificate”, and you would be correct!
However – does the EUDAMED data actually contain this information on a technical level? You see, in technical terms, you need to have some logic in a database like EUDAMED to actually “know” that two mentions of one certificate actually relate to the same certificate.
There are two ways software engineers can build “relations” like this into a database:
- Engineers add “references”: The certificate TÜV-SÜD-42 would only be stored once, and both Device A and Device B get a “reference” to this certificate. This is smart, because you can do smart things like “show me all devices covered by certificate TÜV-SÜD-42”.
- Or: Engineers are lazy and/or technically incompetent and/or completely unaware of “references”, and revert to “copy-pasting” instead: The certificate TÜV-SÜD-42 will be stored alongside every device, and all of its information will be copy-pasted for every device which has this certificate. This is very dumb, because it prohibits you from doing smart things like “show me all devices covered by certificate TÜV-SÜD-42”.
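To make the “references” option a bit more tangible, here’s a minimal sketch using Python’s built-in sqlite3 module. The table and column names are made up for illustration (I have no idea what EUDAMED’s real schema looks like), and SQLite is just a stand-in for any SQL database.

```python
import sqlite3

# In-memory database, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE certificates (
        id INTEGER PRIMARY KEY,
        number TEXT NOT NULL UNIQUE      -- each certificate is stored exactly once
    );
    CREATE TABLE devices (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        certificate_id INTEGER NOT NULL REFERENCES certificates(id)  -- the "reference"
    );
    CREATE INDEX idx_devices_certificate ON devices(certificate_id);
""")

# One certificate, two devices pointing at it (hypothetical example data).
conn.execute("INSERT INTO certificates (id, number) VALUES (1, 'TÜV-SÜD-42')")
conn.executemany(
    "INSERT INTO devices (name, certificate_id) VALUES (?, ?)",
    [("Device A", 1), ("Device B", 1)],
)


def devices_covered_by(certificate_number: str) -> list[str]:
    """Return the names of all devices that reference the given certificate."""
    rows = conn.execute(
        """
        SELECT d.name
        FROM devices d
        JOIN certificates c ON c.id = d.certificate_id
        WHERE c.number = ?
        ORDER BY d.name
        """,
        (certificate_number,),
    ).fetchall()
    return [name for (name,) in rows]


print(devices_covered_by("TÜV-SÜD-42"))  # ['Device A', 'Device B']
```

The certificate lives in exactly one row, so “show me all devices covered by certificate TÜV-SÜD-42” is a single join.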
By now, you’re probably already guessing which approach EUDAMED chose.
Here’s a hint: The EUDAMED architects didn’t go with the references.
So how big is the damage we’re looking at?
Let’s take the example from above: In the crappy copy-pasting scenario, you have two devices which actually belong to the same certificate. Without references, the database now contains two certificates, even though in reality there is only one. I call this the problem of “duplicate certificates”.
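Here’s a tiny sketch of what that looks like in data terms. The structure and the field names (like notified_body) are hypothetical and only meant to illustrate the copy-pasting, not to mirror EUDAMED’s actual format.

```python
# Each device carries its own full copy of the certificate (hypothetical fields).
devices = [
    {
        "name": "Device A",
        "certificate": {"number": "TÜV-SÜD-42", "notified_body": "Example NB"},
    },
    {
        "name": "Device B",
        "certificate": {"number": "TÜV-SÜD-42", "notified_body": "Example NB"},
    },
]

# Counting stored certificate entries counts every copy.
certificate_entries = [device["certificate"] for device in devices]

# Counting distinct certificate numbers recovers the real-world count,
# but only as long as every copy spells the number identically.
unique_numbers = {certificate["number"] for certificate in certificate_entries}

print(len(certificate_entries))  # 2 entries stored
print(len(unique_numbers))       # 1 certificate in reality
```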
You might now ask “How many duplicate certificates do we have in EUDAMED?”, but that’s not necessarily the right question to ask. First, we need to ask ourselves: “How many total certificate entries do we have in EUDAMED, including duplicates?”
Okay. Are you ready for this?
160,118.
Yes: One hundred and sixty thousand one hundred and eighteen certificate entries.
Mind you, in reality, we have somewhere around 4k certificates, not 160k. Mathematically speaking, this implies that each certificate has around 39 duplicate entries in the EUDAMED database – on average.
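If you want to check the math yourself, here’s the back-of-the-envelope version. It only uses the numbers from this post; the one assumption is that 442 + 3,843 is the “real” certificate count, which is itself only an estimate.

```python
listed_in_search = 442          # certificates visible in the search results
hidden_in_device_pages = 3_843  # certificates only found via device pages
total_entries = 160_118         # certificate entries including duplicates

real_certificates = listed_in_search + hidden_in_device_pages
total_duplicates = total_entries - real_certificates

# Average number of entries per real certificate, exact and with the rounded "4k".
entries_per_certificate = total_entries / real_certificates
entries_per_certificate_rounded = total_entries / 4_000

print(real_certificates)                        # 4285
print(f"{total_duplicates:,}")                  # 155,833
print(f"{entries_per_certificate:.1f}")         # 37.4
print(f"{entries_per_certificate_rounded:.1f}") # 40.0
```

So with the rounded 4k you land at roughly 40 entries per certificate, i.e. around 39 duplicates each. With the unrounded 4,285 it’s a bit lower, at roughly 37 entries per certificate.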
This is an epic mess of biblical proportions. Who is going to clean this up when clearly no one feels responsible?
Well, I guess.. we did. Because, as you noticed, I was able to give you the number of around 4k “real” certificates, and that number required some significant engineering on our side to clean up the epic mess of EUDAMED data.
As a side note, the funny thing is that no one at EUDAMED would be able to give you this number, because their data is so messy that they are literally not able to know how many certificates they have.
Okay – so ~156k duplicate certificates. That’s cool. But we’re still not done. There’s more to the duplicate certificates!
It Gets Even Worse
You see, our approach above relied on one assumption: We assume that, when certificate data is copy-pasted between devices, its certificate number is entered in exactly the same way, without formatting issues. So, every certificate would need exactly the number TÜV-SÜD-42 for us to detect that yes, this is the same certificate – not TUEV-SUED-42, TUV_SUD_42, etc.
Yeah, yeah.. you know where I’m going next.
As soon as someone mistypes or makes a copy-pasting error like in the examples above, or even only adds an empty space at the end, it looks like a new certificate!
Let’s take a quick look into the EUDAMED certificate data. Here are some certificate numbers:
In the examples above, you can see that certificate numbers have been entered as duplicates – some entries have a whitespace at the end while others do not. There’s no technical way to reliably detect this, as the messiness is unpredictable: Sure, whitespace is easy, but scenarios like TUEV vs. TÜV are not. It’s a gigantic mess, and I think it can only be cleaned up manually.
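To show where the “easy” part ends, here’s a rough normalization sketch. The rules are assumptions I picked for illustration (trim whitespace, uppercase, unify separators, transliterate German umlauts), not our actual cleaning pipeline, and the example numbers are the made-up ones from this post.

```python
import re
import unicodedata
from collections import defaultdict

# Common German transliterations; an assumption, not an exhaustive rule set.
UMLAUTS = {"Ä": "AE", "Ö": "OE", "Ü": "UE"}


def normalize_certificate_number(raw: str) -> str:
    """Build a comparison key: trim, uppercase, transliterate, unify separators."""
    text = unicodedata.normalize("NFKC", raw).strip().upper()
    for umlaut, replacement in UMLAUTS.items():
        text = text.replace(umlaut, replacement)
    # Treat runs of whitespace, underscores and hyphens as a single hyphen.
    return re.sub(r"[\s_-]+", "-", text)


def group_entries(raw_numbers: list[str]) -> dict[str, list[str]]:
    """Group raw certificate numbers by their normalized key."""
    groups = defaultdict(list)
    for raw in raw_numbers:
        groups[normalize_certificate_number(raw)].append(raw)
    return dict(groups)


entries = ["TÜV-SÜD-42", "TÜV-SÜD-42 ", "TUEV-SUED-42", "TUV_SUD_42"]
for key, variants in group_entries(entries).items():
    print(key, variants)
```

The first three variants collapse into one group, but TUV_SUD_42 still ends up as its own “certificate”. You could add yet another rule for that, and then the next typo shows up. That’s what I mean by unpredictable.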
And while no one at the EU is taking responsibility for EUDAMED, how likely is it that we’ll find someone who will manually clean up 160k certificate entries?
Okay, so 4k certificates, but we’re not sure, because of data entry errors. That’s cool. What’s next?
Certificate Numbers Unknown
So, truth be told: We also don’t know how many unique certificates exist in the data, because of the data entry issues mentioned above. Likely fewer than 4k. But we simply don’t know. And, to our knowledge, our data is the cleanest, yet we don’t know, so.. likely no human currently knows how many real certificates actually exist in EUDAMED right now.
I feel like there’s something deeply philosophical about this – something along the lines of “some things are just really hard for humans to measure”. Examples:
- Events inside a black hole
- The temperature of absolute zero
- The exact center of an electron
Now, we can add an entry to this list:
- The number of certificates in EUDAMED
There’s one silver lining though, which brings me to the actual topic I wanted to write about today (man, I really digressed): You can now access the cleaned data in BEUDAMED – here’s the complete list of certificates, for example. So the huge irony is that BEUDAMED is now not only a better user experience, it’s also more complete and coherent than EUDAMED.
BEUDAMED November 2024 Update
This month’s BEUDAMED update is the first major update of BEUDAMED since we launched it one month ago.
Let’s see which features EUDAMED shipped in the meantime:
- None.
Cool, with that out of the way, let’s look at what changed in BEUDAMED. I think these two screenshots illustrate the differences pretty well (before and after):
So while the initial BEUDAMED release already covered the most important data (devices, manufacturers, etc.), it wasn’t complete yet. This is now fixed – we’ve got all the “other” stuff now in there, too – most importantly Authorized Representatives, Basic UDI-DIs and Certificates. But also Importers and System Producers.
Also, its search interface is brutally simple and a joy to use, while it’s based on so much cleaner data, as we’ve seen above. Now, with BEUDAMED, you can literally find answers to questions which are impossible to answer with EUDAMED:
- Which devices are covered by certificate X? (Example)
- How many manufacturers and devices does each Authorized Representative cover? (Crazy example – more on this another time)
- Which notified bodies do software device companies choose? (Example – look at the numbers in the left menu)
It’s tremendously cool. And we continue to offer BEUDAMED 100% for free, while developing it with our own resources.
Someone asked me if I’d want to apply for EU funding for BEUDAMED. Hah! Good joke..
Until next time, and in the meantime, enjoy BEUDAMED. If you have any feedback to share, I’d be more than happy to hear it! Leave a comment below, or contact us via the contact form on this website (bottom right corner).
4 comments
A.B.
Hi,
thank you for the reply and the passion you put into this. I would be interested in learning more about EUDAMED: what is the underlying technology? Why was it chosen? What are the specifications or use cases that were given to the tech team? Any constraints? Who designed it and is there anyone who could explain their technical decisions?
I don’t have the answers, so it’s hard for me to argue. My point was that embedding data can be a valid design choice depending on the use case. I didn’t say it’s the right choice in the context of EUDAMED.
For the EUDAMED team, maybe there was no requirement to display all certificates on the website, maybe there was a requirement to display max 442 certificates, maybe the team was under (time) pressure or running out of money and had to make poor design decisions, …
Again, I know too little about EUDAMED and its technology and its team.
While I share your frustration with the issues you outlined, I have to respectfully disagree with your characterization of the engineering team (I don’t know who developed it) as “lazy or incompetent”. IT projects like this often face constraints and pressures that aren’t always visible from the outside. These could include tight deadlines, resource limitations, or even conflicting requirements from stakeholders, all of which can lead to suboptimal design choices.
Rather than assuming a lack of skill or effort, I think it’s more productive to critique the design itself and explore how it could be improved in future iterations.
By the way, I checked the EUDAMED certificates page earlier and it returned 815 certificates 😀
Dr. Oliver Eidel
Thanks for your reply!
I think you do have valid points there! I mean, I tend to be overly critical, and I tend to assume incompetence too early. And, as I’ve noticed while working in software development teams, it’s hardly ever the fault of the actual software developers building things – instead, problems most often originate from bad management decisions. Or, even if there’s an incompetent engineer, the decision to hire that person is also somewhat a management mistake.
Still – in the meantime, we’ve received some documents from the EU about EUDAMED’s resources. Given that they went over budget and spent €9M for 2023 alone (9 million!), €300k in hosting costs (also in 2023 alone!) and their total team size is 57 people, I mean.. at the very least, it’s not viable to argue that they’re starved of resources.
A.B.
Just a comment regarding what you call the certificate inflation: how you design a database very much depends on the use cases, i.e. read and write patterns. I know nothing about the technical implementation of the EUDAMED database, but there might have been good reasons why the engineers chose to store certificate information alongside the device instead of accessing this information via a reference, e.g. performance, avoiding JOINs. Well, I don’t know the reasons, but I wouldn’t just call them “lazy and/or technically incompetent”. MongoDB has a good article about references vs. embedded data: https://www.mongodb.com/docs/manual/data-modeling/concepts/embedding-vs-references/#std-label-embedding-vs-references
Dr. Oliver Eidel
Hi,
Short answer: No, it does not depend on the use case. In other words, there is no excuse to duplicate data in this blatantly amateurish way. In Postgres (or any SQL database, really), adding a foreign key to an indexed column and then querying it via a join is nearly free, performance-wise. Hell, even in MongoDB, you can add references with hardly any performance impact, if the referenced field has an index!
You’d have to search long and hard for a situation in which “avoiding JOINs” in a SQL database is a reasonable course of action. Or, in other words, you’d need to find a scenario in which a JOIN significantly impacts your performance. In the context of EUDAMED, I can’t think of one – after all, we’re looking at rather simple JOINs here.
And we even have the real-world data to back it up! BEUDAMED runs much faster than EUDAMED – instead of taking 10-20 seconds for a search query, it takes less than one second. That’s a 10-100x improvement in speed – without the need for any certificate duplication.
So I stand by my point: Unless proven otherwise, I think the EUDAMED data model was designed by lazy or technically incompetent people. Again – I’d be happy to stand corrected. Like, if you could show me one way in which the EUDAMED data model provides a real-world, huge performance advantage over indexed SQL joins.. sure. It’s just that we’re not seeing it right now.
Furthermore, MongoDB provides no referential integrity – so you run the risk of your references becoming “stale” once the underlying item has been deleted. This is a pretty good essay on why using MongoDB for “referential-heavy” data is not a good idea.
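For anyone wondering what “referential integrity” buys you in practice, here’s a small sketch. I’m using SQLite via Python’s built-in sqlite3 module as a stand-in for Postgres, and the tables are made up for illustration. Note that SQLite only enforces foreign keys once you switch them on per connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this; it's off by default
conn.executescript("""
    CREATE TABLE certificates (
        id INTEGER PRIMARY KEY,
        number TEXT NOT NULL UNIQUE
    );
    CREATE TABLE devices (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        certificate_id INTEGER NOT NULL REFERENCES certificates(id)
    );
""")
conn.execute("INSERT INTO certificates (id, number) VALUES (1, 'TÜV-SÜD-42')")
conn.execute("INSERT INTO devices (name, certificate_id) VALUES ('Device A', 1)")


def try_delete_certificate(certificate_id: int) -> bool:
    """Attempt to delete a certificate; return False if the database refuses."""
    try:
        conn.execute("DELETE FROM certificates WHERE id = ?", (certificate_id,))
        return True
    except sqlite3.IntegrityError:
        # A device still references this certificate, so the delete is rejected.
        return False


print(try_delete_certificate(1))  # False
```

The database refuses to delete a certificate that a device still points to, so the reference can’t go stale behind your back.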