Define your solution

The first stage of building your solution with Skyflow is defining it. During this stage, you identify your usage patterns and define the schema for your vault based on how Skyflow fits into your existing systems and architecture.

Assess your data requirements

The first step to implementing your Skyflow solution is understanding your usage patterns and architecture.

Consider the following resources based on your needs and usage patterns:

Protect PII

Protect personally identifiable information (PII) such as names, addresses, and social security numbers.

Protect PCI

Protect payment card information (PCI) such as credit card numbers and CVV codes.

Protect LLMs

Protect sensitive data in large language models (LLMs).

Protect Data Analytics

Protect sensitive data in analytics and machine learning.

Protect Data Residency

Protect sensitive data in different geographic locations.

Once you’ve defined your use cases and the flow of your data, define your scope and integration timeline.

Set up your environment

Next, access your Sandbox environment and start exploring Skyflow Studio.

After receiving a welcome email from Skyflow, confirm the setup details for your Sandbox environment. Once completed, you can access your Sandbox environment, where you can develop and test a production-ready integration with Skyflow.

Log in to Skyflow Studio, the central management tool for your Skyflow account, to configure and manage all aspects of your Skyflow deployment, including users and permissions.

Accounts and Environments

Learn about your Skyflow accounts and environments.

Define your vault schema

Your vault schema specifies the tables and columns you can use to store data, and defines how you can integrate data from your existing systems.

Defining your schema requires identifying your present and future data requirements, creating a vault, and configuring how to store your data:

  1. For each type of sensitive data you are storing, consider any compliance requirements like data retention durations and deletion requests.

  2. Map the flow of data through your system:

    • What are all of the entry and exit points for data in your system?
    • Which systems exist between the entry and exit?
    • Which of those systems are exposed to your data and could cause leakage?
    • Do those systems forward your data anywhere?
  3. Review your data flow and determine how Skyflow integrates into your architecture:

    • How is data being sent to your vault?
    • Is any data displayed back to the user on the front end?
    • How will data be read or consumed from your vault?
    • Will any data be sent from your Skyflow vault to a third party?
  4. Create a vault via Studio or the Management API. If you haven’t defined your necessary tables and columns yet, you can start with a template and modify the schema later.

    You can’t modify some column settings after you add data to your vault.

  5. Define the columns for your vault schema based on your data flow and existing architecture. Skyflow provides two ways to add fields when creating columns in a vault:

    • Skyflow Data Types: Pre-configured types for common PII elements with settings for input validation, redaction, tokenization, encrypted operation support, and more.
    • Basic Data Types: Standard database types like integers and strings.
  6. Skyflow columns support multiple configuration options that define how data is stored, protected, and accessed. These settings determine a column’s behavior and properties such as validation, tokenization, redaction, and encryption.

    Review the following column settings in Studio or using the tags in the Management API (see vault settings):

    • Basic properties for the column such as name, description, tags, column group, data type, array, regex validations, uniqueness, and required status.

    • Tokenization substitutes sensitive values with non-sensitive tokens. Determine whether a column needs tokenization and, if so, which tokenization format is appropriate.

      Persistent tokens are for long-lived values and remain in your vault until you delete them. These tokens can be formatted to be random (generates a different token for every instance of a value) or consistent (generates the same token for every instance of a value). Additionally, you can enable format-preserving tokenization to replace the sensitive values in your database without changing the schema or any validation rules, as the token would appear in the same format as the plain text data.

      Note: You can define the format for the generated tokens with a regex. If you don’t define the regex, Skyflow attempts to infer the format of the provided value.

      Transient tokens are for temporary storage of sensitive values, which are automatically deleted after a predefined Time to Live (TTL). Updating a transient field with a new value resets the TTL, but if the token expires, detokenization requests fail.

      Security tip: Consider using transient tokenization if your use case involves short-term storage of highly sensitive values, such as credit card information (refer to CVV best practices).

      You can also configure a column to not use tokenization.

    • Redaction settings control how sensitive data is obscured to prevent unauthorized access. Full redaction returns values as “REDACTED”, partial redaction applies a regex mask policy and formats to values, and no redaction returns values in plain text.

      Make sure all columns that contain sensitive data have some level of redaction applied.

    • Determine how your data should be encrypted when stored in your Skyflow vault. Disk-level encryption keeps data encrypted at rest and is always enabled. Column-level encryption enables querying functionality. More specifically, encrypted operations enable exact match queries, and encrypted integer operations (Aggregation and Comparison) enable queries with mathematical operations (SUM, AVERAGE, <, >, sort by).

      Indexing your columns enables faster data retrieval.
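The three redaction levels described in the column settings above can be sketched in a few lines. This is a minimal illustration, not Skyflow's implementation or API: the `Redaction` enum, the `apply_redaction` function, and the default mask pattern (which hides every digit except the last four) are all hypothetical names and choices made for this example.

```python
import re
from enum import Enum


class Redaction(Enum):
    """Hypothetical labels for the three redaction levels in the text."""
    FULL = "full"
    PARTIAL = "partial"
    NONE = "none"


def apply_redaction(
    value: str,
    mode: Redaction,
    mask_pattern: str = r"\d(?=(?:\D*\d){4})",  # assumed example mask policy
    mask_char: str = "X",
) -> str:
    """Return the value as a caller with the given redaction level would see it."""
    if mode is Redaction.FULL:
        # Full redaction hides the value entirely.
        return "REDACTED"
    if mode is Redaction.PARTIAL:
        # Partial redaction replaces every regex match with the mask character.
        # The default pattern matches any digit followed by at least four more digits.
        return re.sub(mask_pattern, mask_char, value)
    # No redaction returns plain text.
    return value
```

With the default pattern, `apply_redaction("4111 1111 1111 1111", Redaction.PARTIAL)` returns `"XXXX XXXX XXXX 1111"`, which is the kind of partial view a support agent might need without seeing the full card number.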

Blog: How to Keep Sensitive Data Out of Your Logs: 9 Best Practices

Best practices to keep sensitive data out of your logs during development.

Create a vault schema

Create a vault schema in Studio or using the Management API.

Blog: What is Tokenization? What Every Engineer Should Know
Blog: Understanding Tokenization vs. Encryption and When to Use Them
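
As a rough illustration of the random and consistent token behavior described earlier, the toy store below keeps an in-memory mapping between tokens and values. It is a conceptual sketch only: `ToyTokenStore` and its methods are hypothetical, it does not reflect how Skyflow generates or stores tokens, and it omits format-preserving and transient tokens.

```python
import secrets
from typing import Dict


class ToyTokenStore:
    """In-memory stand-in for a token vault (illustrative only)."""

    def __init__(self, consistent: bool) -> None:
        self.consistent = consistent
        self._value_by_token: Dict[str, str] = {}
        self._token_by_value: Dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        # Consistent mode: reuse the token already issued for this value.
        if self.consistent and value in self._token_by_value:
            return self._token_by_value[value]
        # Otherwise issue a fresh random token with no relation to the value.
        token = secrets.token_hex(16)
        self._value_by_token[token] = value
        if self.consistent:
            self._token_by_value[value] = token
        return token

    def detokenize(self, token: str) -> str:
        # Raises KeyError for unknown tokens.
        return self._value_by_token[token]
```

A store created with `consistent=True` returns the same token each time it sees the same value, so tokens can be joined or counted in analytics without detokenizing. A store created with `consistent=False` returns a different token on every call, so equal values cannot be correlated from their tokens alone.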

When you create your schema, plan for future expansion and use cases. Once you create a vault, you can always add new columns and edit select column settings. However, once data is in a column, certain settings are fixed and cannot be modified later, such as tokenization options, column-level encryption, indexing, and some general settings (column group, data type, and array). If you’ve already inserted some test data, you can delete records in Studio or via the Delete Record API.
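
A schema review script could flag risky changes before they are attempted. The sketch below encodes only the settings the paragraph above names as fixed once a column holds data. The setting labels and the `blocked_changes` helper are hypothetical and are not Management API identifiers; settings outside the listed set are simply not covered here.

```python
from typing import Iterable, List

# Settings the text says cannot be modified once a column contains data.
# The string labels are assumptions made for this example.
LOCKED_ONCE_POPULATED = frozenset({
    "tokenization",
    "column_level_encryption",
    "indexing",
    "column_group",
    "data_type",
    "array",
})


def blocked_changes(requested: Iterable[str], column_has_data: bool) -> List[str]:
    """Return the requested setting changes that the text says are locked.

    An empty column is treated as unrestricted by this sketch.
    """
    if not column_has_data:
        return []
    return sorted(set(requested) & LOCKED_ONCE_POPULATED)
```

For example, `blocked_changes(["description", "data_type", "indexing"], column_has_data=True)` returns `["data_type", "indexing"]`, signaling that test records would need to be deleted before those two settings are revisited.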

Table columns: Category, Setting, Can always change, Can change if column is empty, Can change if table is empty, Cannot change, Notes.

General
  • Name ✓
  • Description ✓
  • Tags ✓
  • Column Group ✓
  • Data Type ✓
  • Array ✓
  • Regex Validation ✓* (Can be updated only in an additive manner.)
  • Uniqueness ✓
  • Required ✓

Tokenization
  • Tokenization Type ✓

Redaction
  • Redaction Type ✓

Encryption
  • Disk-level Encryption ✓ (Always enabled.)
  • Column-level Encryption ✓
  • Encrypted Operations ✓
  • Indexing ✓

Tabular summary of modification conditions for column settings.

Some key considerations for schema flexibility:

  • Choose between random and consistent tokens based on your analytics and matching needs. Additionally, decide whether you need format-preserving tokens.
  • Consider how data will be queried and accessed from your vault when setting up encrypted operations to make sure that your schema supports the operations you expect to perform, such as exact matching, sorting, or aggregations.
  • Group related columns across tables that need consistent token generation. Columns in a group must share the same data type, tokenization type, and validation rules.
  • Plan for data deletion requirements, especially across column groups, and determine whether any data needs temporary storage using transient fields.
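
The column group rule in the list above (columns in a group must share the same data type, tokenization type, and validation rules) is easy to check on a draft schema before creating it. The following is a minimal sketch under assumptions: `ColumnPlan` and `find_group_conflicts` are hypothetical planning helpers, not part of any Skyflow SDK, and the field values are free-form strings chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class ColumnPlan:
    """One planned column in a draft schema (hypothetical structure)."""
    table: str
    name: str
    data_type: str
    tokenization: str
    validation: Optional[str] = None  # e.g. a regex, or None
    group: Optional[str] = None       # column group name, if any


def find_group_conflicts(columns: Iterable[ColumnPlan]) -> Dict[str, List[str]]:
    """Map each inconsistent column group to the columns it contains."""
    by_group: Dict[str, List[ColumnPlan]] = defaultdict(list)
    for col in columns:
        if col.group is not None:
            by_group[col.group].append(col)

    conflicts: Dict[str, List[str]] = {}
    for group, members in by_group.items():
        # All members must agree on these three settings.
        signatures = {(c.data_type, c.tokenization, c.validation) for c in members}
        if len(signatures) > 1:
            conflicts[group] = [f"{c.table}.{c.name}" for c in members]
    return conflicts
```

Running this over a draft that places `users.email` and `orders.contact_email` in the same group would return an empty dictionary when their settings match, and would list both columns under the group name when, say, one uses random tokens and the other consistent tokens.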

If needed, Skyflow can assist in reviewing your schema and provide architecture recommendations to ensure smooth integration.

Next steps

Once you design your solution and the schema for your vault, you’re ready to start building your solution.