Case Study

SURA: Generative AI to find new ways of managing health

At Datalytics, we spent six months building a generative AI solution on OpenAI technology that lets healthcare professionals use data to diagnose diseases and understand the genetic information of populations.

Client: SURA – A leading insurance provider in LATAM. It has over 10K employees across the region and more than 80 years of history.

Technology: Azure OpenAI, Azure Bot Service, Azure Cosmos DB, Azure Functions, App Service, Storage Account, Azure Container Instances

Time: Six months.

SURA is the leading insurance company in Colombia and one of the most important in LATAM. It has over 80 years of history and currently holds health data on more than 5 million affiliated individuals.

In 2023, SURA inaugurated its Omics Science Center in the city of Medellín. From there, they focus on exploring people’s DNA and analyzing the population’s genomic data to predict, prevent, and diagnose diseases, as well as to discover new ways to manage health. The goal is to develop personalized medicine, empowering patients to take control of their well-being and make data-driven decisions.

This type of science—including genomics—generates a massive amount of information about human DNA that must be processed in order to be understood.

How are genetic data processed?

DNA is extracted from a blood sample, which provides a lot of information about the individual. The sample is then run through sequencing equipment, which outputs the DNA sequence as a long string written in four letters (A, C, G, and T), one for each base.

In a human being, a single genome contains three billion of these letters. This is an almost unmanageable amount of data that must be processed and analyzed to determine, for example, if a patient has a disease, why they have it, whether a genetic explanation can be provided, what diseases they may develop in the future, etc.
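To give a rough sense of that scale, here is a minimal sketch of the arithmetic, assuming one genome is stored either as plain text (one byte per letter) or packed at two bits per letter. The `pack` helper and its name are illustrative only and are not part of the SURA solution.

```python
# Illustrative only: sizes follow from the "three billion letters" figure above.
BASES = "ACGT"
CODE = {base: i for i, base in enumerate(BASES)}  # A=0, C=1, G=2, T=3


def pack(sequence: str) -> bytes:
    """Pack a DNA string into 2 bits per base (4 bases per byte)."""
    out = bytearray()
    for start in range(0, len(sequence), 4):
        chunk = sequence[start:start + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        byte <<= 2 * (4 - len(chunk))  # left-align a short final chunk
        out.append(byte)
    return bytes(out)


GENOME_LETTERS = 3_000_000_000
bytes_as_text = GENOME_LETTERS          # one byte per letter
bytes_packed = GENOME_LETTERS * 2 // 8  # two bits per letter

print(f"{bytes_as_text / 1e9:.1f} GB as plain text")  # 3.0 GB as plain text
print(f"{bytes_packed / 1e6:.0f} MB when packed")     # 750 MB when packed
```

Even in the compact form, a single genome is hundreds of megabytes, and that is before any interpretation has taken place.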

The interpretation of this data is highly complex and cannot be done by just any healthcare professional. Only specialists in genetics, molecular biology, and related fields have the knowledge to carry out this task.

Once these professionals analyze the data, they generate a PDF report, providing a clinical and biological interpretation of the information. These reports are so technical and specific that they can be difficult for non-specialist doctors to interpret. However, they contain very valuable data, as they are curated information about the patient.

Therefore, to begin understanding the population (whether there are relationships between variants, how age and habits influence health, and so on), it would be necessary to cross-reference this information. In practice, this would be very complicated, because the team was working with more than 500 data formats.

Genomic analysis process

Challenges  

“The information in the genomic reports comes in very heterogeneous formats. So, searching for something in them would require a lot of time and dedication. That’s where we found a solution together with our partner Datalytics. We decided to use generative models to extract information from the PDFs easily and transfer it to a standardized database,” explains Catalina Bustamante, head of technology at SURA’s Omics Science Center.

From this situation, the main challenges were:

• Processing large volumes of data.
• Interoperability: it was necessary to combine genetic reports with medical history data.
• The existence of a large amount of unstructured but highly relevant information.
• Difficulty in interpretation.

Strategy: What did we do?

The goal was to automatically extract information from more than 10,000 PDFs based on the variables we needed and compile it into a standardized report.

“Searching these databases requires technical knowledge of genetics, which is uncommon among clinical staff. However, they do have a clear understanding of what kind of questions to ask and the terms to use. In this sense, we designed a bot using generative AI that allows natural language queries to be made to the database,” adds Bustamante.

Together with the SURA team, Datalytics developed a solution based on a private instance of OpenAI, fed with information from patient genetic reports. Moving forward, we aim to integrate medical history, radiology, and other data.

It is a private chat, restricted to Microsoft Teams, that uses generative AI to query data. Physicians interact using natural language, which facilitates access to and understanding of genomic data.

To achieve this:

  1. We standardized the data: We used a strategy powered by generative AI that normalizes the types of data received and helps machines interpret them. From there, genetic data could be queried just like any medical history.
  2. We used GPT to access the extracted, curated information available in a database. This allows natural language questions, such as: “What is the patient’s family medical history?” GPT not only provides easy access to all available information but also summarizes it without adding potentially erroneous interpretations.
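The standardization step can be pictured as mapping every incoming report onto one target schema. The sketch below is an assumption-laden illustration, not the project's code: the `VariantRecord` fields and the `ALIASES` table are hypothetical, and a hand-written alias lookup stands in for the generative model that performed the normalization in the real solution.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VariantRecord:
    """Hypothetical standardized row; the real schema is not described in the source."""
    patient_id: str
    gene: str
    variant: str
    pathogenicity: Optional[str] = None


# Hypothetical field names as they might appear across report formats.
# In the real solution a generative model handled this mapping.
ALIASES = {
    "patient_id": ("patient_id", "Patient ID", "ID paciente"),
    "gene": ("gene", "Gene", "Gen"),
    "variant": ("variant", "Variant", "Variante"),
    "pathogenicity": ("pathogenicity", "Classification", "Clasificación"),
}


def standardize(raw: dict) -> VariantRecord:
    """Map one extracted report (any known format) onto the standard schema."""
    fields = {}
    for canonical, names in ALIASES.items():
        for name in names:
            if name in raw:
                fields[canonical] = str(raw[name]).strip()
                break
    missing = [k for k in ("patient_id", "gene", "variant") if k not in fields]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return VariantRecord(**fields)


english = {"Patient ID": " p1 ", "Gene": "G1", "Variant": "V1", "Classification": "pathogenic"}
spanish = {"ID paciente": "p1", "Gen": "G1", "Variante": "V1", "Clasificación": "pathogenic"}
print(standardize(english) == standardize(spanish))  # True
```

Once every report lands in the same shape, a question about a variant becomes an ordinary database lookup, regardless of which of the original formats the report arrived in.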

Physicians who are not specialists in genetics can ask questions naturally, such as: “How many patients have X variant and also suffer from X pathology?” or “How many patients have a parent younger than 40 with X disease?”

GPT is crucial because it translates these questions into database queries, sparing geneticists from having to run them manually. Once the query has run, the AI agent composes a natural language response for the physician.
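The question-to-query-to-answer flow can be sketched as follows. This is a simplified stand-in under stated assumptions: an in-memory SQLite table replaces the project's database, the `findings` table and its columns are hypothetical, and `translate_question` is a regular-expression stub that plays the role of the GPT call and only understands one question shape.

```python
import re
import sqlite3
from typing import Tuple


def translate_question(question: str) -> Tuple[str, Tuple[str, str]]:
    """Stub for the model call: turns one known question shape into SQL."""
    match = re.search(r"have (\S+) variant and also suffer from (.+?)\?", question)
    if match is None:
        raise ValueError("question shape not supported by this stub")
    sql = (
        "SELECT COUNT(DISTINCT patient_id) FROM findings "
        "WHERE variant = ? AND pathology = ?"
    )
    # Values travel as parameters, never spliced into the SQL text.
    return sql, (match.group(1), match.group(2))


def answer(conn: sqlite3.Connection, question: str) -> str:
    """Translate the question, run the query, and phrase the result."""
    sql, params = translate_question(question)
    (count,) = conn.execute(sql, params).fetchone()
    return f"{count} patient(s) match: variant {params[0]} with {params[1]}."


# Hypothetical toy data; labels are placeholders, not real variants or diseases.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE findings (patient_id TEXT, variant TEXT, pathology TEXT)")
conn.executemany(
    "INSERT INTO findings VALUES (?, ?, ?)",
    [
        ("p1", "V1", "condition A"),
        ("p2", "V1", "condition B"),
        ("p3", "V1", "condition A"),
        ("p4", "V2", "condition A"),
    ],
)

print(answer(conn, "How many patients have V1 variant and also suffer from condition A?"))
# 2 patient(s) match: variant V1 with condition A.
```

Keeping translation, execution, and phrasing as separate steps means the numbers in the reply come from the database itself, with the model only shaping the query and the wording.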

GenAI-assisted queries


Results

Through the chat, healthcare personnel (geneticists and molecular biology professionals) can ask all kinds of open-ended questions in natural language and in multiple languages. They can also view an integrated natural language response that summarizes the findings and presents them alongside the associated documents.

Fredy Cuervo, a molecular biology professional at SURA’s Omics Science Center, said: “This agent allows us to analyze the genetic variants we encounter daily, enabling us to extract information quickly and accurately. It helps us identify which disease a variant is related to, its pathogenicity, and speeds up our analysis processes, allowing us to access information faster for research purposes.”

“This tool supports healthcare professionals and researchers in comparing information between patients and, most importantly, helps consolidate population data to produce research and create new services and products that contribute to people’s well-being,” concludes Carlos Andrés Agudelo, manager of SURA’s Biosciences department.

This content was originally published on Datalytics.com. Datalytics and Lovelytics merged in January 2025.
