Databricks Enterprise Security

Secure your big data and ML workflows with a unified approach to data security.




Databricks Platform

Defense in Depth

Databricks employs a Defense in Depth security model to provide the most advanced protection for your data, AI and Apache Spark™ workflows at every layer.

Read blog on Platform Security >

Physical

AWS and Azure data centers are frequently audited and comply with a comprehensive set of frameworks, including ISO 27001, SOC 1, SOC 2, SOC 3, and PCI DSS.

Additionally, AWS and Azure physical data centers are in undisclosed locations and have stringent physical access controls to prevent unauthorized access, including biometric access controls, 24-hour security guards, and video surveillance.


Infrastructure

  • Access Control: Control over inbound and outbound traffic leveraging security groups (AWS) and network security groups (Azure).
  • Logging and Monitoring: Comprehensive logging and monitoring for security events.
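
As a rough illustration of the access-control idea above, the following sketch models AWS-style security group evaluation: traffic is denied unless some rule allows it. The `Rule` class, `is_allowed` function, and the rule set are hypothetical names invented for this example and are not part of any Databricks, AWS, or Azure API. Azure network security groups also support explicit deny rules and priorities, which this simplified model does not cover.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    """One allow rule (hypothetical model, not a cloud provider API)."""
    direction: str   # "inbound" or "outbound"
    protocol: str    # e.g. "tcp"
    port_from: int   # first port in the allowed range
    port_to: int     # last port in the allowed range
    cidr: str        # peer address range in CIDR notation


def is_allowed(rules, direction, protocol, port, peer_ip):
    """Return True if any rule permits the traffic; otherwise deny by default."""
    addr = ip_address(peer_ip)
    return any(
        r.direction == direction
        and r.protocol == protocol
        and r.port_from <= port <= r.port_to
        and addr in ip_network(r.cidr)
        for r in rules
    )


# Assumed example rules: HTTPS in from one address range, HTTPS out to anywhere.
RULES = [
    Rule("inbound", "tcp", 443, 443, "203.0.113.0/24"),
    Rule("outbound", "tcp", 443, 443, "0.0.0.0/0"),
]
```

With these assumed rules, an inbound HTTPS connection from 203.0.113.10 is permitted, while an SSH connection from the same address, or HTTPS from an address outside the range, is rejected.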


Host

  • Hardening: All hosts run a current release of Ubuntu (Data Plane) or CoreOS (Control Plane). Operating systems are hardened according to industry best practices.
  • Scanning: Hosts are scanned monthly for vulnerabilities.
  • Patching and Updates: Hosts are patched periodically for security updates and critical patch fixes.


Application

  • Secure System Development Life Cycle (SDLC): Adhere to security processes and checks that are an integral part of development.
  • Security QA and Penetration Testing: Rid the platform of security defects with rigorous security and pen testing.
  • Developer Security Training: Educate developers on security principles essential for their role.
  • Threat Modeling: Assess major risks to design and implement preventative security controls.
  • End User Security: Databricks provides the following capabilities natively in the platform:
      • Single Sign-On (SSO): Authenticate users with your existing provider using SAML 2.0. Databricks supports Okta, Google for Work, OneLogin, Ping Identity, and Microsoft Windows Active Directory.
      • Role-based Access Controls: Apply access control policies using Databricks cluster AWS IAM roles, Microsoft Active Directory roles, and ACLs for notebooks, workspaces, jobs, clusters, and libraries.
      • Audit Logs: Gain insight into events within your deployment.
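
To make the ACL idea above concrete, here is a minimal sketch of how a per-object access check might work when permissions can be granted to a user directly or through a group. The `Level` names, the `can` function, and the sample ACL are assumptions made for illustration. They are not taken from the Databricks API and may not match the platform's actual permission model.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical ordered permission levels; higher values include lower ones."""
    NONE = 0
    READ = 1
    RUN = 2
    EDIT = 3
    MANAGE = 4


def effective_level(acl, user, groups):
    """Highest level granted to the user directly or through any of their groups."""
    principals = [user, *groups]
    return max((acl.get(p, Level.NONE) for p in principals), default=Level.NONE)


def can(acl, user, groups, required):
    """True if the user's effective level meets or exceeds the required level."""
    return effective_level(acl, user, groups) >= required


# Assumed ACL for a single notebook, keyed by user or group name.
NOTEBOOK_ACL = {
    "data-science": Level.RUN,
    "data-engineering": Level.EDIT,
    "alice@example.com": Level.MANAGE,
}
```

In this sketch, a member of the hypothetical `data-science` group can run the notebook but not edit it, and a user with no matching entry has no access at all.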


Data

  • Data Encryption: We use the latest version of TLS and strong encryption from AWS KMS and Azure Key Vault.
  • Access Controls: Fine-grained access control to notebooks, workspaces, jobs, and clusters.
  • Databricks Access: Automated control over Databricks access to customer data.
  • Data Governance: Customer data is persisted in designated AWS and Azure regions.
  • Backups: Automated scheduled backups of metadata and systems every 24 hours.
  • Retention and Deletion: Adherence to strict data retention policies in compliance with customer requirements.


Secure Platform with Consistent Workflows

 

Many companies today operate on disjointed, homegrown DIY (do-it-yourself) data and AI platforms. The Databricks Unified Analytics Platform brings data engineering and data science teams together, giving data scientists the agility they want while providing data engineers with a consistent, secure and reliable toolset with no patching or configuration issues.

  • Get managed security on a cloud-native platform
  • Keep your data and infrastructure in your own cloud account, with separate data and control planes
  • Integrate easily with existing security processes

Read blog on Identity Federation >

Secure and Transparent Collaboration

 

Databricks lets data engineering and data science teams work together in a single shared workspace. Databricks interactive notebooks contain runnable code, visualizations, and narrative text, and can be shared by multiple teams with commenting and versioning. This not only enhances collaboration but also provides a single interface to control, track and audit access to data.

 

Learn how to securely access external data sources from Databricks for AWS > 

 
