Need to Know – Beyond SAP Analytics Cloud AI and U…

  • By sujay
  • 11/06/2026

The Data and Analytics Landscape Is Changing — Are You Ready?

For years, SAP customers have faced a familiar challenge: rich, high-quality business data locked inside ERP and line-of-business systems, difficult to extract, expensive to replicate, and often arriving too late to drive meaningful decisions. The emergence of AI and machine learning has only sharpened that challenge. Building a reliable AI model demands not just data volume, but data quality, business context, and governance — the very things that traditional ETL-and-copy approaches erode.

This blog follows on from my previous blog, Need to Know – Unlocking the Power of AI in SAP Analytics Cloud. It explains how you can extend the AI, machine learning and predictive capabilities of SAP Analytics Cloud using SAP Databricks, which is part of SAP Business Data Cloud.

Here is a summary of what each section covers:

Part 1 — SAP Business Data Cloud: Explains what BDC is.

Part 2 — What Is Databricks: Covers what Databricks is.

Part 3 — SAP Databricks in BDC: Explains the embedded, SAP-managed nature of the integration and why it matters when creating machine learning or predictive models.

Part 4 — Why Databricks Beyond SAC: Where Databricks helps beyond the AI, machine learning and predictive inside SAP Analytics Cloud.

Part 5 — Real-World Use Cases for SAP Databricks in BDC: Some example use cases summarised.

Part 6 — SAP Databricks and BDC Example: A walkthrough from subscribing to the BDC Intelligent Application, through Delta Sharing and model training in SAP Databricks, to sharing results back to BDC and surfacing them through Datasphere in SAC reports.

 

Part 1 – What Is SAP Business Data Cloud?

Launched in early 2025, SAP Business Data Cloud (BDC) is SAP's fully managed, cloud-native SaaS platform for data and analytics. It is not simply a rebranding of existing tools — it is a fundamental rearchitecting of how SAP customers unify, govern, and activate their data.

At its core, BDC brings together the capabilities that SAP customers have historically relied on, such as SAP Datasphere and SAP Analytics Cloud, under a single, cohesive modern data platform.

BDC establishes what SAP calls a Business Data Fabric — an architecture that connects data from SAP S/4HANA, SuccessFactors, Ariba, and other line-of-business applications, as well as third-party sources through a set of key capabilities:

  • Data Products: Pre-built, curated datasets delivered by SAP that carry full business semantics and context. These are not raw tables — they are governed, business-ready data assets covering domains such as finance, supply chain, and HR. They are accessed via open protocols including Delta Sharing, meaning data remains in place and is shared securely without duplication.
  • Foundation Services: The behind-the-scenes engine that handles data ingestion, transformation, and orchestration. Foundation Services use Open Resource Discovery (ORD) to catalogue and expose Data Products across the platform, storing data in HANA Data Lake Files using the Delta Lake format.
  • Intelligent Applications: Pre-built, fully SAP-managed analytical applications — covering Finance Intelligence, People Intelligence, Supply Chain Intelligence, and Working Capital — that are ready to consume with a single-click installation. They are built on top of Data Products, Datasphere models, and SAP Analytics Cloud stories, and include pre-defined metrics, models, and stories/applications to use the data immediately.  Unlike traditional SAP business content, Intelligent Applications are lifecycle-managed by SAP, including data integration, updates, and ongoing support.
  • SAP Analytics Cloud (SAC): The presentation and analytics layer within BDC, supporting business intelligence, planning, and scenario simulation. SAC connects live to Datasphere models and surfaces insights to consumers immediately.
  • SAP Datasphere: The semantic modelling and data integration hub within BDC. Datasphere hosts custom data models, manages connections to on-premise and third-party sources, and provides the analytical layer that SAC consumes.
  • SAP BW Private Cloud Edition (PCE): For organisations with existing BW investments, BDC supports lifting on-premise BW systems to a Private Cloud Edition managed by SAP. This enables continuity while the BW Data Product Generator unlocks access to BW data and modernises it into Business Data Cloud Data Products.
  • BDC Connect: A secure, zero-copy data sharing capability that allows SAP and partner systems to exchange data products in real time, without building or maintaining ETL pipelines.

 

Several blogs already cover SAP Business Data Cloud in more detail, including this excellent one: https://community.sap.com/t5/technology-blog-posts-by-sap/sap-business-data-cloud-faqs/ba-p/14022781

The point of this blog is not to explain SAP Business Data Cloud again, but to explore how SAP Business Data Cloud, combined with SAP Databricks, can help deliver further AI, machine learning and predictive use cases using a data platform of governed, semantically rich, and AI-ready data.

 

Part 2 – What Is Databricks, and Why Does It Matter?

Databricks is an independent data and AI company founded in 2013 by the creators of Apache Spark — the open-source distributed computing engine that underpins much of the modern big data ecosystem. Since then, Databricks has grown into one of the most widely adopted platforms for data engineering, data science, and machine learning at enterprise scale.

What does Databricks give organisations?

  • Apache Spark at scale: Distributed data processing capable of handling petabyte-scale datasets efficiently, enabling data engineering teams to build robust ingestion, transformation, and enrichment pipelines.
  • Collaborative notebooks: Data scientists and engineers work in shared Python, Scala, R, or SQL notebooks — enabling rapid experimentation, reproducibility, and team collaboration. Spark pipelines, SQL analysis, and ML model development all happen in the same environment.
  • MLflow: Databricks' open-source ML lifecycle management tool tracks experiments, manages model versions, and facilitates model deployment. It gives organisations full visibility into how models were trained, what data they used, and how they perform over time.
  • Mosaic AI (formerly Databricks ML): A comprehensive suite of AI and ML capabilities including AutoML for automated model training, Feature Store for curating and sharing ML features across teams, Model Serving for real-time and batch inference endpoints, and Vector Search for retrieval-augmented generation (RAG) use cases.
  • Unity Catalog: Databricks' unified governance layer for data and AI assets — covering tables, models, notebooks, and Delta Sharing recipients — with fine-grained access controls and full lineage tracking across the entire data and AI lifecycle.
  • Delta Sharing: An open protocol for sharing live data across organisational and platform boundaries without copying it. Data remains in the source location; recipients query it securely via a token-based credential system.
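
To make the Delta Sharing bullet a little more concrete: with the open-source `delta-sharing` Python connector, a recipient addresses a shared table as `<profile-file>#<share>.<schema>.<table>`, where the profile file holds the endpoint and the bearer token. The sketch below is only an illustration under assumptions: the profile path and the share, schema and table names are hypothetical, and it shows the open protocol in general, not how any particular SAP-managed environment is configured.

```python
def shared_table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build the '<profile>#<share>.<schema>.<table>' address used by the connector."""
    for part in (share, schema, table):
        # The address is split on '#' and '.', so name parts must not contain them.
        if not part or "." in part or "#" in part:
            raise ValueError(f"invalid name component: {part!r}")
    return f"{profile_path}#{share}.{schema}.{table}"


def load_shared_table(url: str, limit: int = 100):
    """Read a bounded sample of the shared table into a pandas DataFrame."""
    import delta_sharing  # pip install delta-sharing; imported lazily on purpose
    return delta_sharing.load_as_pandas(url, limit=limit)


# Hypothetical names, for illustration only.
url = shared_table_url("config.share", "finance_share", "ar", "ar_ageing")
print(url)
```

Calling `load_shared_table(url)` would need a valid profile file issued by the data provider; the data itself stays with the provider and is read on request.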

Organisations across industries — financial services, manufacturing, retail, life sciences — use Databricks because it resolves a problem that few other platforms address simultaneously: enabling data engineers, data scientists, business analysts, and ML engineers to work together on the same governed data, in the same environment, without the friction of hand-offs and data movement.  Read more about Databricks here https://www.databricks.com/

 

Part 3 – SAP Databricks Inside SAP Business Data Cloud

Through a landmark partnership announced alongside the launch of BDC, SAP has embedded Databricks directly into Business Data Cloud as a fully SAP-managed component. This means that when you subscribe to BDC and enable SAP Databricks, you are not integrating an external tool — you are activating a native capability within the same governed environment.

Using SAP Databricks with SAP data gives data scientists and engineers the Databricks pro-code environment, with the assurance that they are working with secure, governed and semantically rich SAP business data.

Key capabilities of SAP Databricks inside BDC:

  • Zero-copy data access to SAP-managed Data Products via Delta Sharing — no pipelines, no replication
  • Pro-code development environment: Python notebooks, Spark pipelines, SQL analysis, and ML model training
  • Feature Store: Curate and share ML features across teams and use cases
  • Model training and validation: Experiment tracking via MLflow with full lineage and reproducibility
  • Unity Catalog: Govern all data and model assets with fine-grained access control and lineage
  • Delta Share back to BDC: Publish inference results as derived Data Products for consumption in Datasphere and SAC

While SAP packages Databricks inside BDC as SAP Databricks, it is worth pointing out that organisations with an existing enterprise Databricks implementation can use it with BDC instead.  For more information about the SAP Databricks components above, have a look at this excellent blog: https://community.sap.com/t5/technology-blog-posts-by-members/sap-databricks-is-now-ga-get-the-most-…

 

Zero-Copy Delta Sharing: The Foundation of Everything

The most architecturally significant aspect of SAP Databricks in BDC is how data flows between the Business Data Fabric and the Databricks environment. Rather than exporting, copying, or replicating data, BDC uses Delta Sharing — the open protocol co-developed by Databricks — to share Data Products directly with the SAP Databricks environment without any duplication.

Why does this matter?

When organisations extract data from SAP systems and rebuild it in a separate analytics or AI platform, they inevitably lose something: the business context and semantics that SAP has carefully maintained — the meaning of a cost centre, the hierarchy of a material group, the relationship between an invoice and a payment term. By the time data engineers have rebuilt that context in a new environment, it is often incomplete, stale, or inconsistent.

Delta Sharing eliminates this problem. SAP Data Products — already curated, governed, and carrying full business semantics — are shared directly to SAP Databricks via the Delta Sharing protocol. The data scientist in Databricks is not working with an export; they are working with the live, governed, semantically annotated dataset that SAP manages. No pipelines to build. No copies to reconcile. No semantic context to reconstruct.

 

What this means for AI and machine learning

The quality of an AI or ML model is a direct function of the quality and richness of its training data. When data arrives in a modelling environment already carrying business semantics — accounts payable ageing, material lead times, workforce attrition signals, customer payment patterns — the data scientist spends less time cleaning and engineering features, and more time building models that actually reflect the business problem.

The bidirectional nature of the integration compounds this value. After a data scientist builds and deploys a predictive model in SAP Databricks, the inference results — cash flow forecasts, churn scores, demand signals — can be Delta Shared back to BDC, where they become derived Data Products. Those derived Data Products are then available for consumption in SAP Datasphere analytical models and SAP Analytics Cloud dashboards. The AI output is not stranded in a separate tool; it flows back into the business fabric where decision-makers can act on it.

 

Part 4 – Why Databricks: Going Beyond What SAP Analytics Cloud Already Offers

A question comes up consistently in customer conversations. SAP Analytics Cloud already includes AI, machine learning and predictive functionality, in the form of Smart Predict, Smart Insights, Predictive Planning and more, that allows business users to build predictive models, identify drivers, and generate automated forecasts without writing a line of code.

So why would you need SAP Databricks as well?

The answer lies in the nature of the problem being solved, the scale of data involved, and the profile of the person doing the work.

SAP Analytics Cloud embeds predictive functionality directly in the tool and in the business process it supports. Predictive Planning is a good example: it lets a planner produce forecasts without being a data scientist. SAC also supports more sophisticated predictive models through Smart Predict, which is designed for business analysts who need to build a regression or classification model on structured, moderate-scale datasets, for example payment likelihood scoring on a customer portfolio. It is low-code, business-user-friendly, and tightly integrated with SAC stories and planning workflows. It is excellent at what it does.

However, it was not built for the following scenarios:

  1. Large-scale, complex feature engineering: Enterprise AI use cases often require combining dozens of data sources (ERP transactional data, external market signals, weather feeds, social indicators, third-party risk ratings) and engineering hundreds of features from them. SAC Smart Predict operates on pre-built datasets; SAP Databricks can access SAP data alongside external non-SAP data to enrich the features these models require.
  2. Custom model architectures: SAC Smart Predict offers a curated set of algorithms (classification, regression, time series) with limited hyperparameter control. Data scientists working on complex problems need the freedom to build, experiment, and validate custom architectures. SAP Databricks provides that freedom via Python, TensorFlow, PyTorch, XGBoost, and the full Mosaic AI suite.
  3. Real-time inference at scale: SAP Databricks Model Serving deploys trained models that can be called in real time by transactional applications, planning systems, or operational dashboards. SAC Smart Predict only operates within the SAC environment.
  4. Operationalised ML with full lineage: Running ML in production is not just about training a model; it is also about managing drift, scheduling retraining, monitoring data quality, tracking lineage from raw data to prediction, and maintaining audit trails. SAP Databricks, with MLflow and Unity Catalog, provides a mature MLOps framework for this. SAC Smart Predict does not provide equivalent MLOps tooling.
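
Item 4 mentions managing drift. One common way to quantify drift in a single feature is the Population Stability Index (PSI), which compares the binned distribution seen at training time with the one seen in production. The post does not name a specific metric, so PSI is an assumption here, as are the bin edges and the sample values. Treat this as a sketch, not as the check that SAP Databricks itself applies.

```python
import math
from bisect import bisect_right


def bin_shares(values, edges, eps=1e-6):
    """Share of values in each bin defined by sorted inner edges (floored at eps)."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_right(edges, v)] += 1
    total = len(values)
    return [max(c / total, eps) for c in counts]


def psi(expected, actual, edges):
    """Population Stability Index between a baseline sample and a current sample."""
    e = bin_shares(expected, edges)
    a = bin_shares(actual, edges)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical "days to pay" samples: training baseline vs. recent invoices.
baseline = [10, 20, 25, 30, 32, 35, 40, 45, 50, 60]
recent = [20, 30, 35, 40, 45, 50, 55, 60, 65, 70]
edges = [30, 45]  # bins: below 30, 30 to below 45, 45 and above

print(psi(baseline, baseline, edges))  # identical samples: no drift
print(psi(baseline, recent, edges))    # payments shifting later: PSI grows
```

A frequently quoted rule of thumb reads a PSI above roughly 0.25 as a shift worth investigating, but the threshold is a convention to tune, not a fixed rule.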

In summary: SAP Analytics Cloud and its Smart capabilities remain the right choice for business analysts using embedded AI functionality and for building governed, business-user-accessible predictive models within SAC. SAP Databricks is the right environment for data scientists and data engineers building custom, complex, large-scale AI and ML solutions that need to operate at the speed and sophistication that modern enterprise AI demands. The two are complementary, not competing — and BDC brings them together in a single governed platform.

 

Part 5 – Real-World Use Cases for SAP Databricks in BDC

The combination of governed SAP data and Databricks' AI/ML capabilities opens up a wide range of use cases across every line of business. Here are a couple of examples:

Accounts Receivable Collections Optimisation (Finance/Credit)

Using AR ageing data products from SAP S/4HANA, combined with customer financial health signals and historical payment behaviour, SAP Databricks models can predict which invoices are at risk of late or non-payment with high precision. Collections teams can be automatically prioritised towards the highest-risk, highest-value accounts, reducing DSO (Days Sales Outstanding) and improving working capital without increasing headcount.
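
The prioritisation described above can be sketched as ranking open invoices by expected amount at risk (predicted probability of late payment multiplied by the open amount), next to the standard DSO formula (receivables divided by credit sales, times the days in the period). The invoice records, probabilities and figures below are invented for illustration, and the ranking rule is an assumption, not the logic of any SAP-delivered model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    invoice_id: str
    customer: str
    open_amount: float  # in group currency
    p_late: float       # model output: probability of late payment, 0..1

    @property
    def amount_at_risk(self) -> float:
        return self.open_amount * self.p_late


def prioritise(invoices, top_n=None):
    """Order invoices so the largest expected amount at risk comes first."""
    ranked = sorted(invoices, key=lambda i: i.amount_at_risk, reverse=True)
    return ranked if top_n is None else ranked[:top_n]


def dso(receivables: float, credit_sales: float, days_in_period: int) -> float:
    """Days Sales Outstanding for one period."""
    if credit_sales <= 0:
        raise ValueError("credit_sales must be positive")
    return receivables / credit_sales * days_in_period


invoices = [
    Invoice("INV-1", "C100", 50_000.0, 0.10),   # large but low risk
    Invoice("INV-2", "C200", 8_000.0, 0.90),    # small but high risk
    Invoice("INV-3", "C300", 120_000.0, 0.25),  # largest amount at risk
]
worklist = prioritise(invoices)
print([i.invoice_id for i in worklist])
print(dso(receivables=450_000, credit_sales=1_500_000, days_in_period=90))
```

Ranking by amount at risk, not by probability alone, is what steers the worklist towards accounts that are both high-risk and high-value.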

Workforce Composition Analysis and Talent Intelligence (CHRO)

HR leaders face mounting pressure around talent acquisition costs, skills shortages, and workforce engagement in a volatile labour market. Understanding the composition, capability, and trajectory of a workforce requires more than headcount reports.

BDC integrates with SAP SuccessFactors, providing zero-copy access to Data Products covering employee profiles, skills inventories, role descriptions, performance metrics, and open requisitions. SAP Databricks blends this internal HR data with external labour market signals — industry salary benchmarks, skills-demand trends, competitor hiring patterns — to build predictive models for attrition risk, time-to-hire, and skills gap analysis.

These models enable CHROs to identify at-risk employees before they leave, prioritise upskilling investments based on projected skills demand, and build data-driven talent acquisition strategies. Fed back into SAC, the results give HR business partners and line managers actionable, model-driven workforce insights.

 

Part 6 – SAP Business Data Cloud and SAP Databricks: A Step-by-Step Walkthrough

Let's take the Accounts Receivable Collections Optimisation (Finance/Credit) use case from above and use SAP data with SAP Databricks to predict which invoices are at risk of late or non-payment.

Step 1: Subscribe to the BDC Intelligent Application

In the BDC Cockpit, the organisation subscribes to the BDC Intelligent Application package. This triggers the automatic provisioning of SAP-managed Data Products from the connected SAP S/4HANA system, including:

    • Accounts Receivable Ageing — open invoices by customer, due date, payment terms, and historical payment behaviour
    • Accounts Payable Obligations — open vendor invoices, payment run schedules, and outstanding commitments
    • General Ledger Cash Positions — current bank balances, intercompany positions, and cash pool structures
    • Sales Order Pipeline — open orders not yet invoiced, with expected delivery and billing dates

     

    The above screenshot shows the BDC Cockpit and an example of an Intelligent Application about to be installed.

    These Data Products are delivered with full SAP business semantics — cost centres, company codes, currencies, fiscal calendars — maintained and governed by SAP. No custom extraction or data pipeline is required.

    Step 2: Initiate Delta Sharing to SAP Databricks

    From the BDC Data Product Catalog, the data engineer selects the relevant Finance Data Products and initiates a Delta Sharing connection to the SAP Databricks environment. This creates a Databricks catalog that reflects the live BDC Data Products — the data is not copied; the Delta Sharing protocol makes the tables available directly within the Databricks Unity Catalog as external shares.

     


     

    The above screenshot shows sharing a Data Product using the Delta Sharing functionality.

    The data engineer provisions SAP Databricks as the Delta Sharing recipient. From this point, any updates to the underlying SAP data products — new invoices posted, payments cleared, new sales orders raised — are immediately available to the Databricks environment without any pipeline or batch job.

     

    Step 3: Use the SAP Data to Train, Validate and Run the Cash Forecasting Model

    In a Databricks Notebook, the data engineer writes a Python pipeline to ingest and blend data from SAP Data Products with external data, such as:

    • Foreign exchange rate feeds (from a financial data provider API) to normalise multi-currency receivables into group currency
    • Customer credit scores and payment behaviour data (from a credit bureau) to adjust expected payment dates
    • Macroeconomic indicators — central bank interest rate decisions, PMI indices — to model the impact of economic conditions on customer payment cycles


     

    The above screenshot shows an example notebook accessing the shared SAP data  product for the predictive model.
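
As a rough sketch of the blending step, the snippet below normalises multi-currency receivables into a group currency using an FX rate table. The rates, currencies and record layout are made up. In a real notebook the rates would come from the financial data provider's feed and the receivables from the shared Data Product.

```python
from decimal import Decimal, ROUND_HALF_UP

GROUP_CURRENCY = "EUR"  # assumption for this example

# Hypothetical rates: units of group currency per 1 unit of the source currency.
fx_to_group = {"EUR": Decimal("1"), "USD": Decimal("0.90"), "GBP": Decimal("1.20")}

receivables = [
    {"invoice_id": "INV-1", "currency": "USD", "open_amount": Decimal("1000.00")},
    {"invoice_id": "INV-2", "currency": "GBP", "open_amount": Decimal("250.50")},
    {"invoice_id": "INV-3", "currency": "EUR", "open_amount": Decimal("99.99")},
]


def to_group_currency(row, rates):
    """Return a copy of the row with the open amount converted to group currency."""
    try:
        rate = rates[row["currency"]]
    except KeyError:
        raise ValueError(f"no FX rate for {row['currency']}") from None
    amount = (row["open_amount"] * rate).quantize(Decimal("0.01"), ROUND_HALF_UP)
    return {**row, "group_currency": GROUP_CURRENCY, "group_amount": amount}


normalised = [to_group_currency(r, fx_to_group) for r in receivables]
total = sum(r["group_amount"] for r in normalised)
print(total)
```

`Decimal` is used here instead of floats so that currency amounts round predictably; a missing rate raises an error and is never silently treated as 1.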

    The blended dataset is registered in the Databricks Feature Store, creating reusable, versioned features (e.g., customer_30d_payment_velocity, weighted_ar_collection_probability, fx_adjusted_payable_balance) that can be shared across multiple models and teams.
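
The post does not define how these features are calculated. As a purely illustrative guess, a 30-day payment velocity could be the average amount a customer paid per day over the trailing 30 days. The definition, the tuple layout and the sample payments below are all assumptions, and registering the result in the Feature Store is left out.

```python
from collections import defaultdict
from datetime import date, timedelta


def payment_velocity_30d(payments, as_of):
    """Average amount paid per day, per customer, over the 30 days ending at as_of."""
    window_start = as_of - timedelta(days=30)
    totals = defaultdict(float)
    for customer, paid_on, amount in payments:
        if window_start < paid_on <= as_of:
            totals[customer] += amount
    return {customer: total / 30 for customer, total in totals.items()}


# Hypothetical cleared payments: (customer, clearing date, amount in group currency).
payments = [
    ("C100", date(2026, 5, 20), 3000.0),
    ("C100", date(2026, 6, 1), 1500.0),
    ("C100", date(2026, 4, 1), 9000.0),  # outside the window, ignored
    ("C200", date(2026, 6, 10), 600.0),
]
features = payment_velocity_30d(payments, as_of=date(2026, 6, 11))
print(features)
```

Passing `as_of` explicitly keeps the feature reproducible: the same payments and the same date always give the same value, which matters when features are versioned and reused.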

    The data scientist uses a Databricks Notebook and begins model development:

    1. Exploratory analysis: Using Databricks SQL, the data scientist queries the Feature Store to understand the distribution of payment timing, identify seasonal patterns in cash inflows, and detect anomalies in the AR ageing profile.
    2. Feature engineering: Time-series features are constructed — rolling averages of collection rates by customer segment, lagged payment indicators, treasury balance trends.
    3. Model training: Multiple forecasting algorithms are evaluated using MLflow experiment tracking.
    4. Model validation: Walk-forward cross-validation is run across the last 24 months of historical data.
    5. Model registration: The best-performing model is registered in the Unity Catalog Model Registry with full lineage linking it to the Feature Store features and training Data Products it was built on.
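
The walk-forward validation in step 4 can be sketched as follows: the model is always trained on earlier periods and tested on the periods that immediately follow, then the window rolls forward. Only the 24-month span comes from the text; the 12-month initial training window, the 3-month horizon and the 3-month step are assumptions chosen for the example.

```python
def walk_forward_splits(n_periods, initial_train, horizon=1, step=1):
    """Yield (train_indices, test_indices) with an expanding training window."""
    if initial_train < 1 or horizon < 1 or step < 1:
        raise ValueError("initial_train, horizon and step must be >= 1")
    start = initial_train
    while start + horizon <= n_periods:
        yield list(range(0, start)), list(range(start, start + horizon))
        start += step


# 24 monthly periods, first 12 months as the initial training window,
# then forecast 3 months ahead and roll forward 3 months at a time.
splits = list(walk_forward_splits(n_periods=24, initial_train=12, horizon=3, step=3))
for train, test in splits:
    print(f"train 0..{train[-1]} -> test {test[0]}..{test[-1]}")
```

Unlike shuffled k-fold cross-validation, no test period ever precedes its training data, so the evaluation mirrors how the forecast would be used in practice.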

     

    Step 4: Share Results Back to BDC

    The inference results (for each invoice, whether it is predicted to be paid on time or late, and by how many days) are Delta Shared back to SAP Business Data Cloud, where they become a derived Data Product in the BDC catalog.

     


    The above screenshot shows the results data being shared inside SAP Databricks with SAP BDC.

     

    In SAP Datasphere, a data analyst creates an Analytical Model that joins the Databricks predictions Data Product with live SAP general ledger positions and more.

     


    The above screenshot shows the data flow used to create the Analytical Model which includes the predictions (delay_prediction_dataset) from SAP Databricks joined with data from S/4HANA, SAP Business Warehouse and Google BigQuery.

     

    In SAP Analytics Cloud, a finance analyst builds a dashboard that displays:

      • Predicted overdue amounts
      • Predicted overdue amounts by days late
      • Predicted overdue amounts per customer


      The above screenshot shows an example dashboard showing the results from the predictive model sliced by different dimensions.
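
The three dashboard measures amount to simple aggregations over the prediction rows. The sketch below shows the arithmetic only, with invented rows and assumed ageing buckets; in the walkthrough itself this aggregation happens in the Datasphere Analytical Model and the SAC story, not in Python.

```python
from collections import defaultdict

# Hypothetical prediction rows: (customer, open amount, predicted days late).
predictions = [
    ("C100", 50_000.0, 0),
    ("C100", 12_000.0, 12),
    ("C200", 8_000.0, 45),
    ("C300", 20_000.0, 75),
    ("C300", 5_000.0, 31),
]

# Assumed buckets: (lowest day, highest day or None for open-ended, label).
BUCKETS = [(1, 30, "1-30"), (31, 60, "31-60"), (61, None, "61+")]


def bucket_for(days_late):
    """Map predicted days late to an ageing bucket; None means on time."""
    for low, high, label in BUCKETS:
        if days_late >= low and (high is None or days_late <= high):
            return label
    return None


by_bucket = defaultdict(float)
by_customer = defaultdict(float)
for customer, amount, days_late in predictions:
    label = bucket_for(days_late)
    if label is None:  # predicted on time, so not overdue
        continue
    by_bucket[label] += amount
    by_customer[customer] += amount

total_overdue = sum(by_bucket.values())
print(total_overdue, dict(by_bucket), dict(by_customer))
```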

       

      The Business Outcome

      What was previously a three-day manual process becomes a continuously updated, AI-driven forecast that:

      • Reduces forecast preparation time from days to minutes
      • Improves forecast accuracy by incorporating external signals and machine learning to predict cashflow
      • Enables proactive decision-making — treasury can initiate early payment discount programmes, adjust working capital facilities, or hedge FX exposure based on predicted signals, not historical actuals
      • Maintains full SAP governance — the underlying data never left the SAP-governed environment, and every model prediction is traceable back to its source data

       

      SAP Business Data Cloud represents a significant step forward in how organisations can govern and activate their most valuable data asset: the operational data inside SAP systems. The integration of SAP Databricks takes that foundation and extends it into the domain of enterprise AI — not as a separate project, but as a native, governed, fully integrated capability.

      The combination resolves a tension that has held back SAP AI adoption for years: the need to choose between the richness of SAP data and the power of open, pro-code AI tooling. With SAP Databricks in BDC, you no longer have to choose. The data stays governed, the semantics stay intact, and your data scientists get the best AI/ML platform in the market to build on top of it.

      If your organisation is beginning to explore what AI can do with your SAP data — or if you have been frustrated by the gap between your data quality ambitions and your current tooling — SAP Business Data Cloud with SAP Databricks is worth serious evaluation.
