
Data Management

Infrastructure

Tulip App Servers: Kubernetes. Kubernetes handles scale-out of application servers and services to thousands of nodes. It is used in production by Box, SAP, The New York Times, eBay, Comcast, IBM, and others.

Databases

The Tulip backend uses three major technologies for persisting data:

  1. Read-Only Process Completion Data: PostgreSQL

    • Postgres scales to tens of terabytes of data. Our read-heavy analytics workload scales out via read replicas.

  2. Digital Factory data (users, stations, apps, etc.): MongoDB

    • MongoDB scales to terabytes of data and billions of documents. Per-customer databases allow for easy sharding to distribute load and scale out the cluster.

  3. Machine Outputs: Amazon S3
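
To illustrate how per-customer databases make sharding straightforward, here is a minimal sketch. The naming scheme (customer_<id>), the shard names, and the hash-based placement are all assumptions for illustration; the source does not describe how Tulip actually assigns databases to shards.

```python
import hashlib

# Hypothetical shard names; not taken from the Tulip deployment.
SHARDS = ["shard-a", "shard-b", "shard-c"]


def database_name(customer_id: str) -> str:
    """Each customer gets its own logical database (hypothetical naming scheme)."""
    return f"customer_{customer_id}"


def shard_for(customer_id: str, shards=SHARDS) -> str:
    """Pick a shard deterministically from a stable hash of the customer id."""
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


if __name__ == "__main__":
    for customer in ("acme", "globex"):
        print(database_name(customer), "->", shard_for(customer))
```

Because a whole customer database lives on one shard, no single query has to span shards in this sketch.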

Data Integrity Control

Data inside the database can only be changed through the authorised controls inside the Tulip system. Four main components ensure application security:

SSL

  • 2048-bit RSA key
  • SHA384 signature
  • Yearly rotation
  • Forward-secrecy (ECDHE) preferred
  • Outdated protocol versions (SSL 2/3) forbidden
  • Qualys SSL Labs A+ score
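
As a rough illustration of the protocol and cipher settings above, the following sketch builds a server-side TLS context with Python's standard ssl module. The TLS 1.2 floor and the exact cipher string are assumptions (the list only says SSL 2/3 are forbidden and ECDHE is preferred), and the certificate paths in the comment are hypothetical.

```python
import ssl


def build_server_context() -> ssl.SSLContext:
    """Server TLS context that refuses old protocols and restricts
    TLS 1.2 key exchange to forward-secret (ECDHE) suites."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Assumption: TLS 1.2 as the floor, which also excludes SSL 2/3.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Restrict TLS 1.2 suites to ECDHE key exchange with AEAD ciphers.
    ctx.set_ciphers("ECDHE+AESGCM:ECDHE+CHACHA20")
    # A real server would also load its certificate and private key, e.g.
    # ctx.load_cert_chain("server.crt", "server.key")  # hypothetical paths
    return ctx


if __name__ == "__main__":
    context = build_server_context()
    print(context.minimum_version)
```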

Database Security

  • Randomly-generated authentication keys
  • Colocated in AWS, only local traffic
  • Dedicated read-only accounts for analytics
  • Parameterization to avoid injection
  • Encryption-at-rest and encryption-in-transit
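
The parameterization point can be shown with a small sketch. It uses Python's built-in sqlite3 as a stand-in for Postgres so that it runs anywhere; the users table and the find_user helper are hypothetical. Postgres drivers follow the same pattern with their own placeholder syntax.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, name: str):
    """Look up a user by name with a bound parameter.

    The ? placeholder passes `name` to the database as data,
    so it is never interpreted as SQL text.
    """
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


# Hypothetical in-memory table for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
conn.commit()

rows = find_user(conn, "alice")
# A classic injection string is treated as a literal name and matches nothing.
injected = find_user(conn, "alice' OR '1'='1")
```

With string concatenation, the second call would have returned every row; with a bound parameter it returns none.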

Web Security Standards

  • HSTS to prevent SSL-stripping MITM attacks
  • DOM Templating and CSP to prevent XSS
  • LocalStorage instead of cookies
  • X-Frame-Options to prevent clickjacking
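
The header-based protections above could be applied to every response roughly as follows. This is a framework-neutral sketch: the header names are standard, but the specific values (max-age, the CSP policy, DENY) and the helper names are illustrative assumptions, not Tulip's actual configuration.

```python
def security_headers() -> dict:
    """Response headers corresponding to the list above (illustrative values)."""
    return {
        # HSTS: tell browsers to use HTTPS only, which defeats SSL stripping.
        "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
        # CSP: only load scripts and other resources from the same origin.
        "Content-Security-Policy": "default-src 'self'",
        # Refuse to be rendered inside a frame, which blocks clickjacking.
        "X-Frame-Options": "DENY",
    }


def apply_headers(response_headers: dict) -> dict:
    """Return a copy of the response headers with the security headers set."""
    merged = dict(response_headers)
    merged.update(security_headers())
    return merged


if __name__ == "__main__":
    print(apply_headers({"Content-Type": "text/html"}))
```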

Application Security

  • All web server endpoints verify ACL
  • Enforced with static analysis and code review
  • Password hashing: SHA512 client-side
  • Password hashing: bcrypt + per-user salt server-side
  • Password entropy estimation and minimums
  • Long, random keys for cells and tablets
  • Automated security updates
  • All production code reviewed by multiple engineers
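
The two password-hashing steps above can be sketched as follows. Python's standard library has no bcrypt, so PBKDF2-HMAC stands in for the server-side step here; a deployment matching the list would use a bcrypt library instead. Function names and the iteration count are assumptions for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # illustrative work factor, not a Tulip setting


def client_prehash(password: str) -> str:
    """Client-side step: SHA-512 of the password, so the raw password
    never leaves the client."""
    return hashlib.sha512(password.encode("utf-8")).hexdigest()


def server_hash(prehashed: str, salt: bytes = None):
    """Server-side step: slow, salted hash with a per-user random salt.
    PBKDF2-HMAC is a stand-in for bcrypt in this sketch."""
    if salt is None:
        salt = os.urandom(16)
    derived = hashlib.pbkdf2_hmac(
        "sha256", prehashed.encode("ascii"), salt, ITERATIONS
    )
    return salt, derived


def verify(prehashed: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _, derived = server_hash(prehashed, salt)
    return hmac.compare_digest(derived, expected)


if __name__ == "__main__":
    salt, stored = server_hash(client_prehash("correct horse battery"))
    print(verify(client_prehash("correct horse battery"), salt, stored))
```

Only the salt and the derived hash would be stored; neither the password nor the client-side digest needs to be kept.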

System Data Creation and Management

Data is created and controlled in the following ways:

| Data Element | Created In/By | Stored In | Controls |
|---|---|---|---|
| User Data | Tulip Editor | Mongo | 1. Access controlled (authorised users); 2. System controls changes by design |
| Apps (inc. history and groups) | Tulip Editor | Mongo | 1. Access controlled (authorised users); 2. Permissions; 3. System controls versions |
| App Variables | Tulip Editor | Mongo | 1. Variables are defined and stored in Mongo, and utilized in the execution data saved in Postgres; 2. Each variable has a unique identifier, which cannot be changed; 3. Records in Postgres are read-only values attached to that variable identifier |
| Stations (inc. groups) | Tulip Editor | Mongo | Access controlled (authorised users) |
| Machines | Tulip Editor | Mongo | Access controlled (authorised users) |
| Activity Log | Tulip Editor | Mongo | |
| System Date & Time | Server | Azure | Cannot be changed in the Editor or Player |
| System & Station Timezone | Tulip Editor | Mongo | Access controlled (authorised users) |
| App Executions | Tulip Player continually updates the execution data locally, and on the server using MongoDB's underlying technology for updating collections across client and server. In the Tulip code this module is called the "state manager". When a "process completion" event is sent from the Player to the server, it writes the current status of the "state manager" into a new row in the Postgres database. | Postgres | 1. Access controlled (authorised users); 2. System controls changes by design; 3. Read-only DB; 4. Every method on the back end does an ACL check to make sure the logged-in user has the appropriate credentials |
| Tulip Table Data | Tulip Tables Page, Player, API | Postgres | Access controlled (authorised users) |
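
The App Executions flow described in the table (an ACL check, then a snapshot of the state manager written as a new row on process completion) might look like the sketch below. It is not Tulip's implementation: sqlite3 stands in for Postgres, and the table layout, permission name, variable identifiers, and function names are all hypothetical.

```python
import json
import sqlite3


class PermissionDenied(Exception):
    """Raised when the logged-in user lacks the required permission."""


def check_acl(user: dict, permission: str) -> None:
    """ACL check performed before any back-end method does its work."""
    if permission not in user.get("permissions", ()):
        raise PermissionDenied(permission)


def record_completion(conn, user: dict, app_id: str, state: dict) -> int:
    """On a process-completion event, write the current state-manager
    contents into a new row. Existing rows are never updated."""
    check_acl(user, "complete_process")  # hypothetical permission name
    cur = conn.execute(
        "INSERT INTO completions (app_id, state_json) VALUES (?, ?)",
        (app_id, json.dumps(state, sort_keys=True)),
    )
    conn.commit()
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE completions (id INTEGER PRIMARY KEY, app_id TEXT, state_json TEXT)"
)

operator = {"name": "op1", "permissions": {"complete_process"}}
# Keys are immutable variable identifiers (hypothetical values).
state = {"var_7f3a": 42, "var_91bc": "ok"}
row_id = record_completion(conn, operator, "app-1", state)

# Later changes to the live state do not touch the stored snapshot.
state["var_7f3a"] = 99
```

Each completion produces a new row keyed by variable identifier, which is what lets the stored records stay read-only while the live state keeps changing.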