Solving the Data Access Challenge in Healthcare AI Using Confidential Computing

Many companies claim to provide secure environments for developing artificial intelligence-based algorithms and models for healthcare use. But the pace of development has remained painfully slow. A major challenge is that most algorithm developers only have access to limited, often narrow datasets to validate and train their models. As a result, the models are not generalizable across different clinical settings.

We launched BeeKeeperAI to bridge the gap between algorithm developers and data stewards, both of whom have valid concerns about sharing resources. Algorithm developers need diverse datasets to develop, train, and validate their models. But they need to be sure that their code and algorithm intellectual property is protected in the process. Data stewards may be open to sharing limited patient information with algorithm developers, but they are also concerned about unintentionally exposing private health information. It can take as long as 18 months to iron out an agreement that both parties are comfortable with. That is far too slow. And the costs add up quickly.

Even large academic institutions are not exempt. At the Center for Digital Health Innovation at the University of California, San Francisco, where other members of the BeeKeeperAI founding team and I worked, we struggled to find diverse datasets to train generalizable point-of-care algorithms. Our algorithms worked well with internal patients but did not perform as well when they were used in other clinical settings.

Based on our experiences and conversations with other data stewards and algorithm developers, we developed a proof of concept and drew up the plans for what is now BeeKeeperAI. We envisioned providing a way for data stewards and algorithm developers to connect with each other while protecting their assets. 

The first episode of the HiveCast, our new podcast on all things healthcare AI, covers BeeKeeperAI’s origins and how our technology can accelerate the development of more generalizable AI models and algorithms. In it, I talk about how our technology leverages Microsoft Azure’s confidential computing technology along with powerful encryption tools to provide the highest level of security for training and validating healthcare AI models.

The episode covers how our technology works in detail, but here are the highlights. Both the data and the algorithms are encrypted by their owners and placed in containers. The containers are then uploaded into secure computing enclaves in the data steward’s PHI-compliant cloud environment. The data steward and the algorithm developer each hold their own encryption keys, which they never have to share with each other.

Once the data and algorithm are safely in the enclave, both are decrypted and allowed to interact. The algorithm owner receives a report detailing the algorithm's performance characteristics and high-level metadata. That performance report is the only thing that leaves the secure enclave. After the compute process is complete, the data and algorithm can either be left in the secure enclave for re-use or destroyed, depending on the use case. Importantly, the sensitive, protected data never leaves the data steward’s cloud and is never seen by any other party, not even BeeKeeperAI.
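To make that flow concrete, here is a minimal Python sketch of the idea, not BeeKeeperAI's or Azure's actual implementation. Everything in it is an assumption made for illustration: the function names are hypothetical, the "algorithm" is a single score threshold, and the cipher is a toy stand-in built from the standard library that should never be used to protect real data. It also assumes each owner releases a key only to the enclave, which a real deployment would tie to hardware attestation.

```python
import hashlib
import json
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive a byte stream from key and nonce (toy construction, NOT secure)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Stand-in for real authenticated encryption; prepends a random nonce."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    """Reverse toy_encrypt using the nonce stored in the first 16 bytes."""
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


def run_in_enclave(enc_data: bytes, data_key: bytes,
                   enc_model: bytes, model_key: bytes) -> dict:
    """Hypothetical enclave step: decrypt both assets, evaluate, return only a report."""
    records = json.loads(toy_decrypt(data_key, enc_data))
    model = json.loads(toy_decrypt(model_key, enc_model))
    correct = sum(
        (r["score"] >= model["threshold"]) == bool(r["label"]) for r in records
    )
    # Only aggregate performance figures leave; no records, no model parameters.
    return {"n_records": len(records), "accuracy": correct / len(records)}


# Each party generates and keeps its own key; neither sees the other's.
steward_key = secrets.token_bytes(32)
developer_key = secrets.token_bytes(32)

# Made-up example records and model, for illustration only.
patient_records = [
    {"score": 0.9, "label": 1},
    {"score": 0.2, "label": 0},
    {"score": 0.7, "label": 0},
    {"score": 0.4, "label": 0},
]
model_params = {"threshold": 0.5}

enc_data = toy_encrypt(steward_key, json.dumps(patient_records).encode())
enc_model = toy_encrypt(developer_key, json.dumps(model_params).encode())

report = run_in_enclave(enc_data, steward_key, enc_model, developer_key)
print(report)
```

The point of the sketch is the shape of the exchange: each side encrypts with a key the other never sees, the two assets meet only inside `run_in_enclave`, and the only return value is the aggregate report.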

By guaranteeing the safety of the algorithms and data throughout the process, BeeKeeperAI can cut down on the time needed to draw up lengthy data access contracts for model training and validation. As I note in the podcast, we are the only technology around that allows data stewards to retain control of their data while also making it available for research and development. We have made it easy to load algorithms and data into the enclaves. We also have a much smaller attack surface than other computing environments, so our platform is less susceptible to malicious attacks.

Over the next few years, I expect to see more rapid development of higher-quality healthcare AI models and algorithms on a much shorter time scale than was previously possible and at a fraction of current costs. One application area that we at BeeKeeperAI are particularly excited about is rare diseases. With access to a better caliber of algorithms, clinicians will be able to diagnose these conditions more quickly, saving patients from years of costly and ineffective treatments. Similarly, pharmaceutical companies will be able to identify subgroups of patients that are most likely to benefit from clinical trials.

To listen to the first episode of the HiveCast, click here. 

Want to know more about BeeKeeperAI? Contact us here.

