
ziednanaa/Fabric-SAP-Idocs

 
 


🚀 Real-Time SAP IDoc Data Product with Microsoft Fabric

Azure · Microsoft Fabric · Real-Time Intelligence · Microsoft Purview

A complete end-to-end demonstration of a governed, real-time data product for 3PL logistics operations

🎯 What You'll Find Here

This repository contains a production-ready reference implementation demonstrating how to build a modern data product on Microsoft Fabric with:

✅ Real-Time Data Product with Fabric Real-Time Intelligence (Eventhouse)
✅ OneLake Security - Centralized Row-Level Security across all 6 Fabric engines
✅ Microsoft Purview Unified Catalog - Complete data governance and quality monitoring
✅ GraphQL API via Azure API Management with OAuth2 authentication
✅ Live Demo Application - Visual interface showing partner-specific data filtering
✅ SAP IDoc Simulator - Generate realistic 3PL logistics data


⚡ Quick Start

🎬 See the Demo in Action

# 1. Clone the repository
git clone https://github.com/flthibau/Fabric-SAP-Idocs.git
cd Fabric-SAP-Idocs

# 2. Launch the demo application
cd demo-app
.\start-demo.ps1

# 3. Get an access token (example with FedEx carrier)
.\get-token.example.ps1 -ServicePrincipal fedex

# 4. Open http://localhost:8000 and paste the token

📖 Full Setup Guide: See demo-app/QUICKSTART.md


📋 Business Scenario: 3PL Logistics with Row-Level Security

A manufacturing company outsources logistics to external partners (carriers, warehouses, customers) and needs to expose real-time operational data via API while ensuring each partner sees only their own data.

The Challenge

  • 3 Partner Types: Carriers (e.g., FedEx), Warehouse Partners (e.g., WH-EAST), Customers (e.g., ACME Corp)
  • 5 Data Entities: Orders, Shipments, Deliveries, Warehouse Movements, Invoices
  • Security Requirement: Partners must only see data they're authorized to access

The Solution

Real-Time Data Product powered by:

  • 🔥 Microsoft Fabric Real-Time Intelligence (Eventhouse) for sub-second streaming
  • 🔒 OneLake Security for centralized Row-Level Security across 6 engines
  • 📊 Microsoft Purview for data governance and quality monitoring
  • 🌐 GraphQL API exposed through Azure API Management

📄 Complete Business Case: demo-app/BUSINESS_SCENARIO.md


πŸ—οΈ Architecture Overview

SAP ERP System
      ↓
Azure Event Hubs (idoc-events)
      ↓
╔══════════════════════════════════════════════════════════╗
║  MICROSOFT FABRIC REAL-TIME INTELLIGENCE                 ║
║  ┌──────────────────────────────────────────────────┐    ║
║  │  Eventhouse (KQL Database)                       │    ║
║  │  - Sub-second ingestion                          │    ║
║  │  - Streaming transformations                     │    ║
║  │  - Real-time analytics                           │    ║
║  └──────────────────────────────────────────────────┘    ║
╚══════════════════════════════════════════════════════════╝
      ↓
╔══════════════════════════════════════════════════════════╗
║  MICROSOFT FABRIC LAKEHOUSE                              ║
║  ┌──────────────────────────────────────────────────┐    ║
║  │  OneLake Storage (Delta Lake)                    │    ║
║  │  - Bronze: Raw IDocs                             │    ║
║  │  - Silver: Normalized tables                     │    ║
║  │  - Gold: Business views (materialized)           │    ║
║  └──────────────────────────────────────────────────┘    ║
╚══════════════════════════════════════════════════════════╝
      ↓
╔══════════════════════════════════════════════════════════╗
║  ONELAKE SECURITY LAYER (Centralized RLS)                ║
║  ✓ Real-Time Intelligence (KQL)                          ║
║  ✓ Data Engineering (Spark)                              ║
║  ✓ Data Warehouse (SQL)                                  ║
║  ✓ Power BI (Direct Lake)                                ║
║  ✓ GraphQL API (THIS PROJECT)                            ║
║  ✓ OneLake API                                           ║
╚══════════════════════════════════════════════════════════╝
      ↓
Fabric GraphQL API (partner_logistics_api)
      ↓
Azure API Management (apim-3pl-flt)
   - OAuth2 validation
   - CORS policy
   - Rate limiting
      ↓
╔══════════════════════════════════════════════════════════╗
║  MICROSOFT PURVIEW UNIFIED CATALOG                       ║
║  - Data Product registration                             ║
║  - Data quality monitoring                               ║
║  - Lineage tracking                                      ║
║  - Business glossary                                     ║
╚══════════════════════════════════════════════════════════╝
      ↓
Partner Applications
   - FedEx Carrier Portal
   - Warehouse WH-EAST Dashboard
   - ACME Corp Customer Portal

📄 Technical Architecture: demo-app/API_TECHNICAL_SETUP.md


🎯 Project Overview

This project demonstrates how to build a governed, real-time data product on Microsoft Fabric with enterprise-grade security and quality controls. It showcases the complete journey from SAP IDoc ingestion to partner API consumption.


📦 Repository Contents

🎬 Demo Application (/demo-app)

Live demonstration of Row-Level Security in action

  • Visual Interface: 4-tab application showing partner-specific data filtering
  • OAuth2 Authentication: Service Principal token acquisition scripts
  • Documentation:

Technologies: HTML5, JavaScript, Python HTTP Server

🎲 SAP IDoc Simulator (/simulator)

Generate realistic 3PL logistics data

  • 5 IDoc Types: ORDERS, SHPMNT, DESADV, WHSCON, INVOIC
  • Configurable Scenarios: Warehouse count, customer count, carrier count
  • Azure Event Hubs Integration: Direct ingestion to Fabric Eventstream
  • Documentation: Complete setup and usage guide

Technologies: Python 3.11+, Azure SDK, YAML configuration

cd simulator
python main.py --count 100  # Generate 100 IDocs
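
To give a feel for what the simulator produces, here is a minimal sketch of generating simulated IDoc envelopes in plain Python. It is not the repository's main.py: the field names (idoc_number, carrier_id, created_at) and the carrier list are assumptions made for illustration, and only the five IDoc type names come from the list above.

```python
import json
import random
import uuid
from datetime import datetime, timezone

# The five IDoc message types listed above.
IDOC_TYPES = ["ORDERS", "SHPMNT", "DESADV", "WHSCON", "INVOIC"]

# Hypothetical carrier codes, used only to make the sample data varied.
CARRIERS = ["FEDEX", "UPS", "DHL"]


def make_idoc(idoc_type, rng):
    """Build one simulated IDoc envelope (field names are assumptions)."""
    return {
        "idoc_number": str(uuid.UUID(int=rng.getrandbits(128))),
        "idoc_type": idoc_type,
        "carrier_id": rng.choice(CARRIERS),
        "created_at": datetime.now(timezone.utc).isoformat(),
    }


def generate_idocs(count, seed=None):
    """Return `count` simulated IDocs; a fixed seed gives repeatable types."""
    rng = random.Random(seed)
    return [make_idoc(rng.choice(IDOC_TYPES), rng) for _ in range(count)]


if __name__ == "__main__":
    for doc in generate_idocs(3, seed=42):
        print(json.dumps(doc))
```

Each dictionary serializes to one JSON message, which is the shape an Event Hubs producer would typically send as a single event body.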

🏭 Fabric Configuration (/fabric)

Microsoft Fabric workspace setup and data transformations

  • Eventstream: Real-Time Intelligence ingestion configuration
  • Data Engineering: Spark notebooks for Bronze/Silver/Gold layers
  • Warehouse: SQL schemas and materialized views
  • OneLake Security: Row-Level Security configuration guides

Key Files:

  • warehouse/security/ONELAKE_RLS_CONFIGURATION_GUIDE.md - Complete RLS setup
  • data-engineering/notebooks/gold_layer_orders_summary.py - Gold layer transformations
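
The Bronze/Silver/Gold layering can be illustrated without Spark. The sketch below is a plain-Python stand-in, not the contents of the notebooks: it assumes raw IDocs arrive as JSON strings with hypothetical idoc_type, carrier_id and weight_kg fields, normalizes shipment rows (Silver), and aggregates them per carrier (Gold).

```python
import json
from collections import defaultdict


def to_silver(raw_messages):
    """Bronze -> Silver: parse raw JSON strings and keep normalized shipment rows."""
    rows = []
    for raw in raw_messages:
        try:
            doc = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed payloads are left behind in Bronze
        if not isinstance(doc, dict) or doc.get("idoc_type") != "SHPMNT":
            continue
        rows.append({
            "carrier_id": str(doc.get("carrier_id") or "").strip().upper(),
            "weight_kg": float(doc.get("weight_kg") or 0),
        })
    return rows


def to_gold(silver_rows):
    """Silver -> Gold: shipment count and total weight per carrier."""
    summary = defaultdict(lambda: {"shipments": 0, "total_weight_kg": 0.0})
    for row in silver_rows:
        entry = summary[row["carrier_id"]]
        entry["shipments"] += 1
        entry["total_weight_kg"] += row["weight_kg"]
    return dict(summary)
```

In the real pipeline the same two steps would be DataFrame transformations writing Delta tables; the point here is only the division of responsibility between the layers.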

🌐 API Implementation (/api)

GraphQL and APIM configuration

  • GraphQL Schema: Partner-filtered data access (partner-api.graphql)
  • APIM Policies:
    • CORS configuration
    • OAuth2 validation
    • Rate limiting
    • GraphQL passthrough
  • PowerShell Scripts: Service Principal setup, APIM deployment, testing

Endpoints:

  • GraphQL: https://apim-3pl-flt.azure-api.net/graphql
  • REST (auto-generated): /rest/shipments, /rest/orders, etc.
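
A partner application could call the GraphQL endpoint with nothing more than the Python standard library. This is a hedged sketch: the query below uses hypothetical type and field names (the real ones are defined in partner-api.graphql), and it assumes the gateway accepts a standard Authorization: Bearer header. Your APIM instance may require additional headers such as a subscription key.

```python
import json
import urllib.request

GRAPHQL_URL = "https://apim-3pl-flt.azure-api.net/graphql"  # endpoint listed above

# Hypothetical query: check partner-api.graphql for the actual schema.
SHIPMENTS_QUERY = """
query {
  shipments(first: 10) {
    items { shipment_id carrier_id ship_date }
  }
}
"""


def build_graphql_request(token, query=SHIPMENTS_QUERY, url=GRAPHQL_URL):
    """Build (but do not send) a GraphQL POST request carrying the bearer token."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


def run_query(token):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_graphql_request(token), timeout=30) as resp:
        return json.load(resp)
```

Because the filtering happens server-side, the same query sent with a FedEx token and with a warehouse token should return different rows without any change to the client code.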

πŸ›‘οΈ Governance & Quality (/governance)

Microsoft Purview integration

  • Data Product Registration: Purview catalog integration
  • Data Quality Rules: Automated quality monitoring
  • KQL Queries: Quality validation dashboards
  • Python Scripts: Quality rule deployment

Technologies: Microsoft Purview, KQL, Python

πŸ—οΈ Infrastructure (/infrastructure)

Infrastructure as Code templates

  • Bicep Templates: Azure resource deployment
  • PowerShell Scripts: APIM setup, Service Principal creation, REST API deployment
  • Configuration Files: Resource definitions and policies

πŸ› οΈ Technology Stack

Layer Technology Purpose
Ingestion Azure Event Hubs SAP IDoc streaming
Real-Time Processing Fabric Real-Time Intelligence (Eventhouse) Sub-second analytics
Storage OneLake (Delta Lake) Unified data lake
Transformation Fabric Data Engineering (Spark) ETL pipelines
Analytics Fabric Data Warehouse (SQL) TSQL queries
API Fabric GraphQL + Azure APIM Data product exposure
Security OneLake Security + Azure AD Centralized RLS
Governance Microsoft Purview Data catalog & quality
BI Power BI Direct Lake Real-time dashboards

🔒 Security Architecture

OneLake Security: Single Point of Control

Storage-Layer Row-Level Security enforced across all 6 Fabric engines:

-- Example RLS rule applied at OneLake storage layer
CREATE FUNCTION dbo.PartnerSecurityPredicate(@partner_id NVARCHAR(50))
RETURNS TABLE
WITH SCHEMABINDING
AS RETURN (
    SELECT 1 AS AccessGranted
    WHERE @partner_id = CAST(SESSION_CONTEXT(N'PartnerID') AS NVARCHAR(50))
)

Benefits:

✅ Centralized: One RLS definition, enforced everywhere
✅ Multi-Engine: Works across KQL, Spark, SQL, Power BI, GraphQL, OneLake API
✅ Identity-Aware: Leverages Azure AD Service Principal claims
✅ Impossible to Bypass: Enforced at storage layer, not application layer
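
The effect of the predicate function can be mimicked in a few lines of Python. This is only an illustration of the filtering semantics, not how OneLake enforces it: partner_predicate and apply_rls are hypothetical names, and rows are assumed to be dictionaries with a partner_id key.

```python
def partner_predicate(row_partner_id, session_partner_id):
    """Mirror of the SQL predicate: access is granted only when the IDs match."""
    # As with a NULL session context in SQL, a missing session identity matches nothing.
    return session_partner_id is not None and row_partner_id == session_partner_id


def apply_rls(rows, session_partner_id):
    """Return only the rows the current partner is allowed to see."""
    return [r for r in rows if partner_predicate(r["partner_id"], session_partner_id)]


if __name__ == "__main__":
    shipments = [
        {"shipment_id": "S1", "partner_id": "FEDEX"},
        {"shipment_id": "S2", "partner_id": "WH-EAST"},
        {"shipment_id": "S3", "partner_id": "FEDEX"},
    ]
    print(apply_rls(shipments, "FEDEX"))
```

The design point is that the caller never supplies the filter: the partner identity comes from the session, so a client cannot widen its own view by changing a query parameter.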

Authentication Flow

  1. Partner Application → Acquires OAuth2 token from Azure AD
  2. Token Claims → Include Service Principal ObjectId
  3. APIM Gateway → Validates token, extracts claims
  4. GraphQL API → Sets session context with partner identity
  5. OneLake Security → Filters data based on RLS rules
  6. Partner Receives → Only authorized data
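
Steps 1 and 2 of this flow can be sketched with the standard library. The example assumes the OAuth2 client-credentials grant against the Microsoft identity platform v2.0 token endpoint; the tenant, client and scope values are placeholders you would supply, and read_claims decodes the token payload without verifying its signature, so it is suitable for inspection only (validation stays with the gateway).

```python
import base64
import json
import urllib.parse
import urllib.request


def build_token_request(tenant_id, client_id, client_secret, scope):
    """Build a client-credentials token request (not sent here)."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,  # placeholder: use the scope your API registration expects
    }).encode("ascii")
    return urllib.request.Request(url, data=form, method="POST")


def read_claims(jwt_token):
    """Decode the JWT payload WITHOUT signature verification (inspection only)."""
    payload = jwt_token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))
```

After sending the request with urllib.request.urlopen and reading access_token from the JSON response, read_claims(token).get("oid") shows the Service Principal object ID that the downstream steps key on.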

📄 Complete Security Guide: fabric/warehouse/security/ONELAKE_RLS_CONFIGURATION_GUIDE.md


📊 Data Governance with Microsoft Purview

Unified Catalog Integration

Data Product Registration:

  • Product Name: SAP-3PL-Logistics-Real-Time-Product
  • Domain: Logistics & Supply Chain
  • Owner: Data Product Team
  • SLA: < 5 minutes latency, 99.9% availability

Data Quality Monitoring (6 dimensions):

  1. Completeness: Required fields populated
  2. Accuracy: Valid reference data
  3. Consistency: Cross-entity relationships maintained
  4. Timeliness: Data freshness SLA compliance
  5. Validity: Format and range validations
  6. Uniqueness: No duplicate key violations

Automated Quality Checks:

// Example quality check running in Purview
idoc_shipments_gold
| summarize 
    TotalRows = count(),
    MissingCarrier = countif(isempty(carrier_id)),
    FutureDates = countif(ship_date > now())
| extend 
    CompletenessScore = 100.0 * (1 - todouble(MissingCarrier) / TotalRows),
    ValidityScore = 100.0 * (1 - todouble(FutureDates) / TotalRows)
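
For readers less familiar with KQL, the same two scores can be reproduced in Python. This is a sketch under assumptions: rows are dictionaries with carrier_id and a timezone-aware ship_date, and an empty input returns None rather than a score.

```python
from datetime import datetime, timezone


def quality_scores(rows, now=None):
    """Compute completeness and validity percentages for shipment rows."""
    now = now or datetime.now(timezone.utc)
    total = len(rows)
    if total == 0:
        return None  # no rows, no meaningful score
    missing_carrier = sum(1 for r in rows if not r.get("carrier_id"))
    future_dates = sum(1 for r in rows if r["ship_date"] > now)
    return {
        "CompletenessScore": 100.0 * (1 - missing_carrier / total),
        "ValidityScore": 100.0 * (1 - future_dates / total),
    }
```

With four rows, one missing its carrier and one dated in the future, both scores come out at 75.0, which is a convenient sanity check when comparing against the KQL result.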

📄 Governance Setup: governance/PURVIEW_DATA_QUALITY_SETUP.md


📈 Real-Time Intelligence Capabilities

Eventhouse (KQL Database)

Sub-Second Streaming Analytics:

  • Ingestion latency: < 1 second
  • Query performance: Sub-second for aggregations
  • Retention: Configurable (hot/cold tiers)

Example KQL Queries:

// Real-time shipment tracking
idoc_shipments_raw
| where ingestion_time() > ago(5m)
| where carrier_id == "FEDEX"
| summarize 
    ShipmentCount = count(),
    TotalWeight = sum(weight_kg)
  by bin(ship_date, 1h)
| render timechart
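
The logic of this query (filter by ingestion window and carrier, then bucket by hour) looks like this in Python. It is a rough equivalent for explanation only, with assumed field names ingested_at, carrier_id, ship_date and weight_kg, and timezone-aware datetimes.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def hourly_shipments(rows, carrier_id, window=timedelta(minutes=5), now=None):
    """Count shipments and sum weight per ship-date hour for recently ingested rows."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - window
    buckets = defaultdict(lambda: {"ShipmentCount": 0, "TotalWeight": 0.0})
    for r in rows:
        # Keep only rows ingested inside the window for the requested carrier.
        if r["ingested_at"] <= cutoff or r["carrier_id"] != carrier_id:
            continue
        # Floor the ship date to the start of its hour, like bin(ship_date, 1h).
        hour = r["ship_date"].replace(minute=0, second=0, microsecond=0)
        buckets[hour]["ShipmentCount"] += 1
        buckets[hour]["TotalWeight"] += r["weight_kg"]
    return dict(buckets)
```

Note that the time filter is on ingestion time while the grouping is on ship date, so a shipment ingested a minute ago can still land in an hour bucket from earlier in the day.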

Use Cases:

  • Live operational dashboards
  • Real-time alerting
  • Streaming anomaly detection
  • Interactive exploration

🚀 Getting Started

Prerequisites

  • Azure Subscription with Microsoft Fabric enabled
  • Azure AD tenant with permission to create Service Principals
  • Fabric Workspace with appropriate permissions
  • PowerShell 7+ or Azure CLI
  • Python 3.11+ (for simulator)

Step-by-Step Setup

1️⃣ Deploy Azure Infrastructure

cd infrastructure/bicep

# Deploy Event Hub
az deployment group create \
  --resource-group rg-fabric-sap-idocs \
  --template-file event-hub.bicep

# Deploy APIM (if not existing)
az deployment group create \
  --resource-group rg-fabric-sap-idocs \
  --template-file apim.bicep

2️⃣ Create Service Principals

cd api/scripts

# Create 3 Service Principals (FedEx, Warehouse, ACME)
.\create-partner-apps.ps1

# Grant Fabric workspace access
.\grant-sp-workspace-access.ps1

3️⃣ Configure Microsoft Fabric

Eventstream:

  1. Create Eventstream: idoc-ingestion-stream
  2. Source: Azure Event Hubs (eh-idoc-flt8076/idoc-events)
  3. Destination: Eventhouse kql-3pl-logistics

Lakehouse:

  1. Create Lakehouse: lakehouse_3pl
  2. Run Bronze/Silver/Gold transformation notebooks
  3. Create materialized views

OneLake Security:

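-- Run in Fabric Warehouse
-- See fabric/warehouse/security/ONELAKE_RLS_CONFIGURATION_GUIDE.md
-- Each table needs its own ADD FILTER PREDICATE clause
CREATE SECURITY POLICY PartnerAccessPolicy
ADD FILTER PREDICATE dbo.PartnerSecurityPredicate(partner_id) ON gold.orders,
ADD FILTER PREDICATE dbo.PartnerSecurityPredicate(partner_id) ON gold.shipments,
ADD FILTER PREDICATE dbo.PartnerSecurityPredicate(partner_id) ON gold.invoices
WITH (STATE = ON);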

4️⃣ Deploy GraphQL API

cd fabric/scripts

# Enable GraphQL on Lakehouse
.\enable-graphql-api.ps1

# Deploy API definition
.\deploy-graphql-api.ps1

5️⃣ Configure APIM

cd api/scripts

# Deploy APIM policies (CORS, OAuth, etc.)
.\configure-and-test-apim.ps1

# Test REST API endpoints
.\test-rest-apis.ps1

6️⃣ Run the Demo Application

cd demo-app

# Copy example file and add your secrets
Copy-Item get-token.example.ps1 get-token.ps1
# Edit get-token.ps1 with real Service Principal secrets

# Start demo server
.\start-demo.ps1

# In another terminal, get a token
.\get-token.ps1 -ServicePrincipal fedex

# Open http://localhost:8000 and paste token

7️⃣ (Optional) Setup Purview Governance

cd governance/purview

# Create the data quality rules in Purview
python create_data_quality_rules.py

# Deploy quality monitoring dashboard
# Upload data_quality_monitoring_dashboard.kql to Purview

📄 Detailed Setup Guides: See individual README files in each folder


🧪 Testing the Solution

End-to-End Flow Test

# 1. Generate test IDocs
cd simulator
python main.py --count 50

# 2. Verify Eventstream ingestion
# Check Fabric Eventstream monitoring

# 3. Query Eventhouse (Real-Time Intelligence)
# Run KQL query in Eventhouse portal

# 4. Test GraphQL API
cd ../api/scripts
.\test-graphql-rls.ps1

# 5. Test REST API via APIM
.\test-rest-apis.ps1

# 6. Verify RLS filtering
.\test-fedex-only.ps1  # Should only see FedEx shipments

Data Quality Validation

// Run in Eventhouse or Purview
idoc_shipments_gold
| extend QualityCheck = case(
    isempty(carrier_id), "Missing Carrier",
    isempty(tracking_number), "Missing Tracking",
    weight_kg <= 0, "Invalid Weight",
    "OK"
)
| summarize count() by QualityCheck
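
The case() expression returns the label of the first condition that matches, which is easy to mirror in Python. The sketch below assumes rows are dictionaries with the same three fields; treating a missing weight as "Invalid Weight" is a choice made here and may differ from how the KQL query handles nulls.

```python
from collections import Counter


def quality_check(row):
    """Return the first failing check for a row, or "OK" if none fail."""
    if not row.get("carrier_id"):
        return "Missing Carrier"
    if not row.get("tracking_number"):
        return "Missing Tracking"
    weight = row.get("weight_kg")
    if weight is None or weight <= 0:
        return "Invalid Weight"
    return "OK"


def summarize(rows):
    """Count rows per quality label, like `summarize count() by QualityCheck`."""
    return Counter(quality_check(r) for r in rows)
```

Because only the first matching label is reported, a row that is missing both its carrier and its tracking number is counted once, under "Missing Carrier".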

📚 Documentation Index

Business & Architecture

Setup Guides

API Reference

Governance


🎯 Key Takeaways

🔥 Real-Time Intelligence

  • Sub-second latency from SAP to API using Eventhouse
  • Streaming analytics with KQL for operational insights
  • Hot path for live dashboards and alerting

🔒 OneLake Security

  • Single RLS definition enforced across 6 Fabric engines
  • Storage-layer security impossible to bypass
  • Identity-aware filtering via Azure AD integration

📊 Data Governance

  • Purview Unified Catalog for data product registration
  • Automated quality monitoring with 6 quality dimensions
  • Full lineage tracking from SAP to API

🌐 Modern Data Product

  • GraphQL-first API for flexible data access
  • APIM gateway for enterprise-grade API management
  • OAuth2 authentication with Service Principal claims

🎬 Production-Ready Demo

  • Live application showing real RLS filtering
  • Complete documentation for LinkedIn sharing
  • Infrastructure as Code for repeatable deployment

🤝 Contributing

Contributions are welcome! Please follow these guidelines:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

This project is provided as-is for educational and demonstration purposes.


👥 Author

Florent Thibault
Microsoft - Data & AI Specialist

📧 Contact: [Your Contact Info]
🔗 LinkedIn: [Your LinkedIn]


🌟 Acknowledgments

  • Microsoft Fabric Team - Real-Time Intelligence capabilities
  • Azure APIM Team - GraphQL support
  • Microsoft Purview Team - Data governance platform
  • Community Contributors - Testing and feedback

📌 Version History

  • v1.0.0 (October 2025)
    • ✅ Complete demo application with RLS
    • ✅ Real-Time Intelligence integration
    • ✅ OneLake Security implementation
    • ✅ Purview Unified Catalog integration
    • ✅ GraphQL API via APIM
    • ✅ SAP IDoc simulator
    • ✅ Comprehensive documentation

⭐ If you find this project useful, please star the repository!

🔗 Repository: https://github.com/flthibau/Fabric-SAP-Idocs
